A question about {} -> {} syntax in Learning queries of UPPAAL #227
Dear All,

What is the meaning of the `{...} -> {...}` part of the learning queries in UPPAAL, and what is the difference between the variables placed on the two sides?

Best wishes
Hi Peyman,

The variables `i`, `j`, `d`, and `f` relate to the part of the state space that is observable to the learning agent.

As an example, consider the bouncing ball in Figure 4 of *Teaching Stratego to Play Ball*. We may want to derive a controller that is ignorant of the velocity of the ball (as this might be hard to estimate in the application); we can describe this delimitation in the learning query.
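As a minimal sketch of such a query, where the strategy name, the cost expression `cost`, the horizon `120`, and the variable name `Ball.h` are illustrative assumptions rather than anything fixed by the syntax:

```
strategy HeightOnly = minE (cost) [<=120] {} -> {Ball.h} : <> time >= 120
```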
where `h` is the height of the ball. This is opposed to a query that also observes the velocity, and hence the full state of the ball.
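Under the same illustrative assumptions, with `Ball.v` standing in for the velocity:

```
strategy FullState = minE (cost) [<=120] {} -> {Ball.h, Ball.v} : <> time >= 120
```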
Theoretically this induces a Partially Observable EMDP for which Q-learning is not guaranteed to converge. However, it does (at least to a useful degree) in many practical cases; see e.g. *Playing Wordle with Uppaal Stratego*, where we use this heavily.

**Difference between {i,j} and {d,f}**

The difference between the left-hand and the right-hand side relates to the partition-refinement method described in *Teaching Stratego to Play Ball*. The observation space is split in two: observations in the left-hand side are treated as discrete values, while observations in the right-hand side are treated as continuous ones. In fact, if you only add observations in the left-hand side ({i,j}), you will be training a plain lookup table. If you only place observations in the right-hand side ({d,f}), you will be training function approximations (think neural network, but a different technology). If you mix the two, you will create a table (over the {i,j} observations) of function approximators (over the {d,f} observations).

So taking the bouncing ball as an example:
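The following sketch keeps the same illustrative names; `fint()` is UPPAAL's double-to-integer conversion, used here only as one way to make the observed velocity integral:

```
strategy Mixed = minE (cost) [<=120] {fint(Ball.v)} -> {Ball.h} : <> time >= 120
```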
This trains an agent that, for each (integral) velocity, has a function approximator over the height.

**Why not always use the {i,j} side?**

The sample complexity (i.e. the effort needed to get decent controllers) is conjectured to be exponential in the number of dimensions.

**Why not always use the {d,f} side?**

The table is only valid for observations seen during training, so there might be holes in your table. So be VERY careful with doubles on this side.

Hope that helps.