Simplify MalSimulator and create wrappers for other interfaces #87
base: main
Conversation
Force-pushed aebfcb8 to 753ce55
Force-pushed 502b688 to fe27e46
Force-pushed ce89c22 to 950fed4
malsim/sims/mal_simulator.py
Outdated
```python
        if (reprint_header != 0) and ((entry_nr + 1) % reprint_header == 0):
            formatted_str += header
    return formatted_str

@dataclass
```
Having this be an immutable NamedTuple would be safer IMO
The dataclass can also become immutable with `@dataclass(frozen=True)`.
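For illustration, a minimal sketch of what `frozen=True` buys (the class and field names here are hypothetical, not from the PR):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentState:  # hypothetical name, for illustration only
    name: str
    reward: float = 0.0
    reached_steps: tuple[str, ...] = ()  # tuple, not list, so the field is immutable too

state = AgentState(name="attacker1")
# state.reward = 1.0  # would raise dataclasses.FrozenInstanceError
```

Note that `frozen=True` only blocks attribute assignment; a field holding a list could still be mutated in place, hence the tuple above.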
Sorry, why do we want them to be immutable?
The agent states are (at least in the current implementation) stored in the simulator and potentially updated/changed each step.
Are they not also sent outside the simulator, as per the function signature of the searchers? Sending references to mutable state outside of the simulator seems dangerous to me.
I see your point now. As you say, the agent states are not only internal, but used by the decision agents/searchers. I agree that it feels dangerous. Maybe they should be immutable, but it also feels a bit weird to create new Agent objects in every step. Another option is to use something else in the searchers than those objects, and keep them internal.
You can do:

```python
@property
def state(self):
    """Immutable state"""
    return tuple(self._mutable_state)
```

or something similar.
But the current implementation is not safe in this manner in a thousand places. It is also very easy for someone to bypass these safety measures. If it is too cumbersome to create such a dynamic property (because converting the internal data structure, a dataclass in this case, to something immutable is more involved), then it is not worth it. In this case, the simplest approach seems to be to convert the dataclass to a frozen one when reading the property. But I don't know the performance implications.
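A minimal sketch of that last idea, with hypothetical names: keep the mutable dataclass internal and build a frozen copy each time the property is read.

```python
from dataclasses import asdict, dataclass

@dataclass
class _AgentState:        # internal, mutable (hypothetical)
    reward: float = 0.0
    terminated: bool = False

@dataclass(frozen=True)
class AgentStateView:     # what callers get (hypothetical)
    reward: float
    terminated: bool

class Simulator:
    def __init__(self) -> None:
        self._state = _AgentState()

    @property
    def agent_state(self) -> AgentStateView:
        # A fresh frozen copy per read: callers cannot mutate the
        # simulator's internal state through the returned object.
        return AgentStateView(**asdict(self._state))
```

The cost is one copy per read, which is the performance question raised above.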
Yes, having the agent state object be mutable internally in the MalSimulator but return an immutable object when fetching it might be the easiest way. It would avoid the dangers and confusion of the agent states being changed outside of the simulator. Does that make sense?
That seems like an appropriate solution.
I added MalSimAgentView (a public interface for MalSimAgent). So every time an agent state is returned from MalSimulator it will be read-only.
The observations/infos in the SerializedObsEnv are now created without altering the MalSimAgent objects and instead stored in dicts.
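The actual MalSimAgentView is in the PR diff; as a rough sketch, a read-only view over a mutable state object can be done with delegation (attribute names assumed):

```python
class MalSimAgentView:
    """Read-only proxy over an internal, mutable agent state (sketch)."""

    def __init__(self, agent_state) -> None:
        # Bypass our own __setattr__ for the single internal reference.
        object.__setattr__(self, "_agent_state", agent_state)

    def __getattr__(self, name):
        # Reads are delegated to the wrapped state object.
        return getattr(self._agent_state, name)

    def __setattr__(self, name, value):
        raise AttributeError("MalSimAgentView is read-only")
```

Note this is shallow: a mutable attribute (e.g. a list of reached nodes) reached through the view could still be mutated by the caller.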
Bonus explanation about the searcher agents:
The searchers still depend on action_mask (which comes from the SerializedObsEnv), which is not what I/we actually want in the end (we want the decision agents to work with the MalSimulator without serialization/indexing/action masks). The problem was that when I converted the searcher agents to use the agent state only, they behaved a little differently than before, so until I solve that I will leave them as they are.
EDIT: last paragraph is not relevant any more, the searchers are now 'converted'
malsim/agents/searchers.py
Outdated
```diff
@@ -81,7 +85,7 @@ def select_next_target(
         return current_target, False


-class DepthFirstAttacker:
+class DepthFirstAttacker():
```
Shouldn't the DepthFirstAttacker also inherit DecisionAgent?
Yes, thank you. I just pushed quite a big change to the searcher agents (they will use attackgraph nodes instead of the serialized action masks) where I fixed that as well.
malsim/agents/decision_agent.py
Outdated
How about we consolidate the design of the agents, taking into account my PR here too? Otherwise, if you plan to merge that too, it does not make sense to have them redesigned here only to be redesigned again later.
I had not seen your PR.
In my opinion the agents should not depend on the obs/action mask. Partly because it is not needed, and partly because the obs/action mask will be specific to the MalSimSerializedEnv/ParallelEnv and not part of the base MalSimulator (as per this PR).
The searcher agents in this PR only depend on the action surface of an agent (a list of AttackGraphNodes). I am open to other ways of simplifying the searcher agents further, but I do want to get rid of using the obs/action mask. What do you think?
I'm all for removing this cryptic stuff. My comment was mostly about the design of the agents, which seems unnecessarily convoluted in its current form. There is really no need for 4-5 methods just to pass things around, or 5 variables to do what can be done with 2, etc.
Specifically to your changes here: is there an ambition behind the DecisionAgent? To me, as long as the interface of an agent is fixed and explicit (i.e. the init and the get-next-step method, whatever it's called), there is really no benefit in providing abstract classes or trying to standardize the internal structure of an agent class, especially when it is that small. Or are there requests for something like that by others that make it a necessity?
My approach would be to take my PR (because I think it's great 😃) and just remove the mask etc., unless I'm missing something bigger.
Also, I think it doesn't offer much to have three files to hold what is basically 200 lines of code. The perceived benefit of having the agent module be a package where things are split by conceptual separation does not add much in practice: it makes importing a tiny bit slower, requires 3 open files if someone is editing the package in an editor, etc., and more importantly it is undone by exposing the classes at the package level using `__all__` in `__init__.py`.
Yeah, I am all for simplifying it into one or two methods if possible. I also added two simple tests, so that makes it safer to make those changes.
Is there an ambition behind using the abstract base class DecisionAgent?
The main reason is that it makes it clear to the user/developer that every class that implements it shares the same interface and has the same method name. I agree that it might be a bit overkill since there will probably just be one method... but at the same time I don't think it hurts. Especially when using a linter, it becomes clear that, for example, in cli.py we can use `decision_agent.get_next_action(sim_agent_state)` even though we don't know what class decision_agent has (we type hint it with `DecisionAgent`).
Maybe it is not super pythonic, and maybe there is some better option? Or do you think it adds no value?
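For concreteness, the pattern under discussion looks roughly like this sketch (only `DecisionAgent` and `get_next_action` are named in the PR; the signature and the `action_surface` attribute are assumptions based on the discussion above):

```python
from abc import ABC, abstractmethod

class DecisionAgent(ABC):
    @abstractmethod
    def get_next_action(self, agent_state):
        """Return the next attack graph node to act on, or None."""

class BreadthFirstAttacker(DecisionAgent):
    def get_next_action(self, agent_state):
        # Sketch: pick the lowest-id node on the agent's action surface.
        surface = sorted(agent_state.action_surface, key=lambda n: n.id)
        return surface[0] if surface else None

def run_step(decision_agent: DecisionAgent, sim_agent_state):
    # With the type hint, a linter can verify that get_next_action
    # exists without knowing the concrete agent class.
    return decision_agent.get_next_action(sim_agent_state)
```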
I didn't see the second comment before sending that.
I think having separate files makes sense in this case for future proofing, instead of putting all agents in one file. We will probably add heuristics agents in there (agents that make decisions based on the state of the attack graph) that could be in a separate file, and we might add searcher agents that are specific to defenders as well, so it will grow.
But yeah, whether it makes sense to have a base DecisionAgent is more questionable.
Force-pushed d1b0cf8 to 2fe03ed
Throwing this into the mix:

```python
# Author: Jakob Nyberg, 2025
from collections.abc import Callable
import logging
from typing import Any

import numpy as np
import numpy.typing as npt

logger = logging.getLogger(__name__)

Array = npt.NDArray[np.int32]
BoolArray = npt.NDArray[np.bool_]


def get_new_targets(
    discovered_targets: npt.NDArray[np.int32],
    mask: tuple[npt.NDArray[np.bool_], npt.NDArray[np.bool_]],
) -> tuple[npt.NDArray[np.int32], npt.NDArray[np.int32]]:
    attack_surface = mask[1]
    surface_indexes = np.flatnonzero(attack_surface)
    new_targets = np.array(
        [idx for idx in surface_indexes if idx not in discovered_targets],
        dtype=np.int32,
    )
    return new_targets, surface_indexes


def move_target_to_back(
    current_target: np.int32 | None,
    targets: npt.NDArray[np.int32],
    attack_surface: npt.NDArray[np.int32],
) -> tuple[npt.NDArray[np.int32], np.int32 | None]:
    """
    If the current target was not compromised this turn, put it
    on the bottom of the stack and focus on the next target instead.
    """
    if current_target is None:  # explicit None check; target index 0 is falsy
        return targets, current_target
    if current_target in attack_surface:
        # Pop the top of the stack as the new current target and push the
        # unfinished one to the bottom (the scalar is wrapped in a list so
        # np.concatenate accepts it).
        new_current = targets[-1]
        targets = np.concatenate(([current_target], targets[:-1]))
        return targets, new_current
    return targets, current_target


def choose_target(
    targets: npt.NDArray[np.int32],
    attack_surface: npt.NDArray[np.int32],
) -> tuple[npt.NDArray[np.int32], np.int32, bool]:
    # targets that have not been removed from the attack surface by another agent
    valid_targets = np.array(
        [t for t in targets if t in attack_surface], dtype=np.int32
    )
    if len(valid_targets) == 0:
        return valid_targets, np.int32(0), True
    return valid_targets[:-1], valid_targets[-1], False


def compute_action(
    permute_func: Callable[[Array], Array],
    action_func: Callable[
        [np.int32 | None, Array, Array], tuple[Array, np.int32 | None, bool]
    ],
    add_targets_func: Callable[[Array, Array], Array],
):
    def _compute_action(
        targets: Array,
        mask: tuple[BoolArray, BoolArray],
        current_target: np.int32 | None,
    ):
        new_targets, surface_indexes = get_new_targets(targets, mask)
        targets, current_target, done = action_func(
            current_target,
            add_targets_func(targets, permute_func(new_targets)),
            surface_indexes,
        )
        action = 0 if done else 1
        if action == 0:
            logger.debug(
                'Attacker agent does not have any valid targets and will terminate'
            )
        logger.debug(f'Attacker targets: {targets}')
        logger.debug(f'Attacker current target: {current_target}')
        logger.debug(f'Attacker action: {action}')
        return targets, action, current_target

    return _compute_action


def create_permute_func(seed: int | None, randomize: bool) -> Callable[[Array], Array]:
    s = seed if seed else np.random.SeedSequence().entropy
    rng = np.random.default_rng(s) if randomize else None
    return rng.permutation if rng else lambda x: x


class BreadthFirstAttacker:
    def __init__(self, agent_config: dict[str, Any]) -> None:
        self.current_target: np.int32 | None = None
        self.targets: npt.NDArray[np.int32] = np.array([], dtype=np.int32)
        permute_func = create_permute_func(
            agent_config.get('seed', None), agent_config.get('randomize', False)
        )
        self.compute_action = compute_action(
            permute_func, self._action_func, self._add_new_targets_func
        )

    def __call__(
        self,
        _: dict[str, Any],
        mask: tuple[npt.NDArray[np.bool_], npt.NDArray[np.bool_]],
    ):
        self.targets, action, self.current_target = self.compute_action(
            self.targets, mask, self.current_target
        )
        return (action, self.current_target)

    @staticmethod
    def _action_func(
        current_target: np.int32 | None,
        targets: Array,
        surface_indexes: Array,
    ):
        targets, current_target = move_target_to_back(
            current_target, targets, surface_indexes
        )
        return choose_target(targets, surface_indexes)

    @staticmethod
    def _add_new_targets_func(targets: Array, new_targets: Array):
        new_targets = np.flip(new_targets)  # to comply with the original implementation
        return np.concatenate([new_targets, targets])


class DepthFirstAttacker:
    def __init__(self, agent_config: dict[str, Any]) -> None:
        self.current_target: np.int32 | None = None
        self.targets: npt.NDArray[np.int32] = np.array([], dtype=np.int32)
        permute_func = create_permute_func(
            agent_config.get('seed', None), agent_config.get('randomize', False)
        )
        self.compute_action = compute_action(
            permute_func, self._action_func, self._add_new_targets_func
        )

    def __call__(
        self,
        _: dict[str, Any],
        mask: tuple[npt.NDArray[np.bool_], npt.NDArray[np.bool_]],
    ):
        self.targets, action, self.current_target = self.compute_action(
            self.targets, mask, self.current_target
        )
        return (action, self.current_target)

    @staticmethod
    def _action_func(
        current_target: np.int32 | None,
        targets: Array,
        surface_indexes: Array,
    ):
        # keep working on a target unless it has been removed from the attack surface
        return (
            choose_target(targets, surface_indexes)
            if current_target not in surface_indexes
            else (targets, current_target, False)
        )

    @staticmethod
    def _add_new_targets_func(targets: Array, new_targets: Array):
        # push new targets onto the top of the stack (the end of the array,
        # which choose_target pops from) so the latest targets are worked on first
        return np.concatenate([targets, new_targets])
```
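For reference, a hypothetical driver for the sketch above, with a hand-made mask in the two-element shape the signatures assume (only `mask[1]`, the attack surface, is actually read):

```python
import numpy as np

# Five action indices; indices 1, 2 and 4 are on the attack surface.
mask = (
    np.zeros(5, dtype=np.bool_),
    np.array([False, True, True, False, True]),
)

agent = BreadthFirstAttacker({"seed": 42, "randomize": True})
action, target = agent({}, mask)  # the observation dict is unused, hence {}
print(action, target)  # 1 and one of the surface indices 1, 2, 4
```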
malsim/sims/mal_sim_agent.py
Outdated
```python
@dataclass
class MalSimAgent:
```
I think calling this MalSimAgent is confusing. This is a container for agent-related info. A name like MalSimAgentState seems more representative. The current name adds one more "agent" to the actual agent classes and the maltoolbox attacker.
Yes, this confusion is very annoying, and I think you are right about just calling it MalSimAgentState instead. Will change that.
Fixed
tests/sims/test_example_scenarios.py
Outdated
Can you please explain how this test works? Why doesn't it use searcher agents? BFS is mentioned here but you seem to read the attackers from the toolbox?
It doesn't use any searcher agents because I am not testing the searcher agents, just the MalSimulator functionality. I perform the steps one at a time, using steps that I already know are available to the defender/attacker (as per the model and the entrypoints). The scenario file will probably help to see the entrypoints.
Yes, all attacker agents in the MalSimulator must reference an Attacker object. This is so the simulator can know the entrypoints and reached attack steps, and perform 'compromise' on nodes. This is nothing new in my PR though.
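The shape of such a test, as a sketch only (the step names, agent names, and the exact `step` return shape here are assumptions, not the real test code; the PR text only says that `MalSimulator.step(actions)` returns the performed actions):

```python
def test_known_attack_path(sim, attacker_state, defender_state):
    # Steps picked by hand from the model and the attacker's entrypoints,
    # no searcher agent involved.
    for step_name in ["Host A:access", "Host A:read"]:  # hypothetical names
        node = next(
            n for n in attacker_state.action_surface if n.full_name == step_name
        )
        performed = sim.step({
            attacker_state.name: [node],
            defender_state.name: [],
        })
        # step() reports the actions actually performed
        assert node in performed[attacker_state.name]
```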
Force-pushed bf769ab to 1fa8373
… vectorized simulator

MalSimulator handles:
- agent registration
- simulation step
- action masks
- state of attack graph

But it does NOT handle the observation format/vectorization.

Other changes:
- Agent states are not dicts but MalSimAgent objects
- Add separate vectorized simulator
- Instead use field 'agents' to define any number of agents in scenario settings
…n the simulator
- MalSimAgent is now only used internally in the MalSimulator
- Every time an agent state is returned from the simulator it is converted to a MalSimAgentView (read only)
- Don't alter MalSimAgentView objects
- Instead store obs/infos in dicts that are returned on step/reset
… of code base
- As a temporary solution, to not change the behaviour of the searcher agents, I decided to still use the action mask and not the agent state in them
- End goal is to not use that kwarg, so I added a TODO for that
- Adapt the cli/envs/tests to work with the new searcher agents
…erentiate with malsimagent
… action mask
- And adapt everywhere the searcher decision agents are used (don't give them the kwarg anymore, just a MalSimAgentView)
Force-pushed d208cd3 to 45a1182
```python
# action surface does not have a guaranteed order,
# so for the agent to be deterministic we need to sort
action_surface.sort(key=lambda n: n.id)
```
If the order of the action surface cannot be guaranteed to be consistent (I personally think the simulator should handle such matters), then why type hint it as a list rather than a set?
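If it were a set, a deterministic order could still be recovered at the use site, e.g.:

```python
# action_surface: set[AttackGraphNode] (per the suggestion above)
# Sorting by node id gives a deterministic iteration order without
# the simulator having to maintain a sorted list.
ordered_surface = sorted(action_surface, key=lambda n: n.id)
```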
This is a rather big redesign of the simulator. MalSimulator is now very much simplified:

Things related to building up observations (specifically the ParallelEnv part that was previously in the MalSimulator) have now been moved to a wrapper, malsim_vectorized_obs_env.py (naming here is an open question). This wrapper builds up state step by step from the performed actions returned from MalSimulator.step(actions). Logging for that wrapper was also factored out into its own module (should it be?).

Added the class DecisionAgent (these are agents like BFS/DFS + keyboard + in the future more advanced heuristics). This was done to create a common interface for working with DecisionAgents. Our DecisionAgents are not all done yet: they are still tailored towards the ParallelEnv/MalSim Vectorized Obs Env, but the goal is to have them work with the regular MalSimulator (on attack graph state).

Any number of agents can now be specified in the scenario file. This is a cleaner solution in my opinion.

Let me know if something does not make sense.
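The wrapper split follows a pattern roughly like the sketch below; apart from MalSimulator.step(actions) and the obs/info dicts mentioned above, the names and method bodies are assumptions.

```python
class MalSimVectorizedObsEnv:
    """Sketch: builds serialized observations on top of the plain MalSimulator."""

    def __init__(self, sim) -> None:
        self.sim = sim
        # Per-agent observations live in the wrapper, not on the
        # agent state objects (as described in the comments above).
        self._obs: dict[str, dict] = {}

    def step(self, actions):
        # The core simulator only reports which actions were performed;
        # the wrapper incrementally folds that into vectorized observations.
        performed = self.sim.step(actions)
        for agent_name, nodes in performed.items():
            self._update_obs(agent_name, nodes)
        return self._obs

    def _update_obs(self, agent_name, nodes):
        ...  # indexing/serialization details omitted from this sketch
```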