diff --git a/notebooks/adversarial_search_fall_2021/.ipynb_checkpoints/index-checkpoint.ipynb b/notebooks/adversarial_search_fall_2021/.ipynb_checkpoints/index-checkpoint.ipynb new file mode 100644 index 00000000..76492f75 --- /dev/null +++ b/notebooks/adversarial_search_fall_2021/.ipynb_checkpoints/index-checkpoint.ipynb @@ -0,0 +1,80 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Introduction\n", + "### The concept of adversarial search and an introduction to Game Theory\n", + "In previous lectures, we discussed situations in which we had only a single agent. We didn't consider other parameters affecting our environment but in this chapter, a new type of search is introduced which is called **Adversarial Search**. In adversarial search, we define our problem in a multi-agent context. For instance, while playing a game, our agent has to consider the other agent's moves (adversarial moves) to play in an efficient way. Even in some games we can define a winning stategy which means we can guarantee that in every state of our game, no matter how bad it is, our agent is able to win the game.\n", + "To gather more information about the concept of adversarial search, visit this link." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Game Theory Explanation\n", + "Briefly, Game Theory is designing the strategies and the steps of our player to interact in the best way, according to the rival's steps and strategies. In other words, Game theory is the study of mathematical models of strategic interactions among rational decision-makers. To know the game theory better, you can stop by here.\n", + "To achieve this goal, we need to express our game and its rules and actions in a mathematical form. 
The common model used to represent games is the model below:\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Resource Limits\n", + "Although the minimax algorithm would be proper for problems with relatively small state space, it isn't an efficient and feasible one for problems with more complicated and larger state space. Since the number of game states it has to examine is exponential in the depth of the tree.\n", + "\n", + "Consider a _reasonable_ chess game with $b \\approx 35$ and $m \\approx 100$ . Due to the time complexity of this algorithm which is $O(b^m)$, solving the game is not possible at all. We'll discuss some ideas to solve the problem in details below." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Depth-limited search\n", + "One idea might be running the algorithm up to a specified depth instead of the searching the whole tree which(_depth-limited search_).\n", + "\n", + "![hello](1.jpg)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.5" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/notebooks/adversarial_search_fall_2021/images/1.png b/notebooks/adversarial_search_fall_2021/images/1.png new file mode 100644 index 00000000..f3890d52 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/1.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/2.png b/notebooks/adversarial_search_fall_2021/images/2.png new file mode 100644 index 00000000..fce08d6e Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/2.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/3.png 
b/notebooks/adversarial_search_fall_2021/images/3.png new file mode 100644 index 00000000..cb5ea6d7 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/3.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/4.png b/notebooks/adversarial_search_fall_2021/images/4.png new file mode 100644 index 00000000..460a2d43 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/4.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/5.png b/notebooks/adversarial_search_fall_2021/images/5.png new file mode 100644 index 00000000..7e3cd232 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/5.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/Alpha_Beta_Intro.png b/notebooks/adversarial_search_fall_2021/images/Alpha_Beta_Intro.png new file mode 100644 index 00000000..60216770 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/Alpha_Beta_Intro.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_1.png b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_1.png new file mode 100644 index 00000000..4929f17a Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_1.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_2.png b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_2.png new file mode 100644 index 00000000..446ef417 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_2.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_3.png b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_3.png new file mode 100644 index 00000000..d5844656 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_3.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_4.png 
b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_4.png new file mode 100644 index 00000000..07e83f29 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/alpha_beta_run_4.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/backgammon.jpeg b/notebooks/adversarial_search_fall_2021/images/backgammon.jpeg new file mode 100644 index 00000000..3ad7c605 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/backgammon.jpeg differ diff --git a/notebooks/adversarial_search_fall_2021/images/depth-limited.jpg b/notebooks/adversarial_search_fall_2021/images/depth-limited.jpg new file mode 100644 index 00000000..1f645679 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/depth-limited.jpg differ diff --git a/notebooks/adversarial_search_fall_2021/images/expectimax_tree.png b/notebooks/adversarial_search_fall_2021/images/expectimax_tree.png new file mode 100644 index 00000000..32cb5733 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/expectimax_tree.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/snakes_ladders_img.png b/notebooks/adversarial_search_fall_2021/images/snakes_ladders_img.png new file mode 100644 index 00000000..9fd0aec4 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/snakes_ladders_img.png differ diff --git a/notebooks/adversarial_search_fall_2021/images/zero-sum-game.jpeg b/notebooks/adversarial_search_fall_2021/images/zero-sum-game.jpeg new file mode 100644 index 00000000..28ac82da Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/images/zero-sum-game.jpeg differ diff --git a/notebooks/adversarial_search_fall_2021/index.md b/notebooks/adversarial_search_fall_2021/index.md new file mode 100644 index 00000000..c5b95fcb --- /dev/null +++ b/notebooks/adversarial_search_fall_2021/index.md @@ -0,0 +1,614 @@ +# Adversarial Search + + +## Table of Contents + +--- + + ++ ### + ### 
[Introduction](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#introduction-1) + + + ### + - ### [The concept of adversarial search and an introduction to Game Theory](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#the-concept-of-adversarial-search-and-an-introduction-to-game-theory-1) + + + ### + - ### [Game Theory Explanation and Deterministic Games](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#game-theory-explanation-and-deterministic-games-1) + + + ### + - ### [Different catergories of games](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#different-catergories-of-games-1) + + + + + ++ ### + ### [Different Types of trees for games](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#differnt-types-of-trees-for-games) + + + + ### + - ### [Single-Agent Trees](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#single-agent-trees-1) + + + ### + - ### [Adversarial Agent Trees](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#adversarial-agent-trees-1) + + + + ++ ### + ### [Minimax Strategy](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#minimax-strategy-1) + + + + ### + - ### [Pseudocode of Minimax Strategy](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#pseudocode-of-minimax-strategy-1) + + + ### + - ### [Properties of Minimax Strategy](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#properties-of-minimax-strategy-1) + + + + ++ ### + ### [Resource limit](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#resource-limits) + + + + ### + - ### [Depth-limited 
Search](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#depth-limited-search-1) + + + ### + - ### [Evaluation Function](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#evaluation-functions) + + + ### + - ### [Iterative Deepening Search](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#iterative-deepening-search-1) + + + + + ++ ### + ### [Pruning](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#pruning-1) + + + + ### + - ### [Houston, We Have A Problem!](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#houston-we-have-a-problem-1) + + + ### + - ### [Minimax Pruning](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#minimax-pruning-1) + + + ### + - ### [Alpha-Beta Pruning](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#alpha-beta-pruning-1) + + ++ ### + ### [Expectimax Search](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#expectimax-search-1) + + ++ ### + ### [Other types of games](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#other-types-of-games-1) + + + + ### + - ### [Multi-Agent Utilities](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#multi-agent-utilities-1) + + ++ ### + ### [Conclusion](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#conclusion-1) + + ++ ### + ### [Useful Linkes](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#useful-links) + + ++ ### + ### [Reference](https://github.com/mothegoat/notes/blob/master/notebooks/adversarial_search_fall_2021/index.md#references) + + +## +## Introduction + +
+
+###
+### The concept of adversarial search and an introduction to Game Theory
+
+
+In previous lectures, we discussed situations with only a single agent and did not consider other factors affecting the environment. This chapter introduces a new type of search called **Adversarial Search**, in which the problem is defined in a multi-agent context. For instance, while playing a game, our agent has to consider the other agent's moves (adversarial moves) in order to play well. In some games we can even define a winning strategy, which means that in every state of the game, no matter how bad it looks, our agent is able to win.
+For more information about the concept of adversarial search, see the links at the end of this document.
+
+###
+### Game Theory Explanation and Deterministic Games
+
+
+Briefly, Game Theory is about designing the strategies and steps of our player so that it interacts in the best way with respect to the rival's steps and strategies. In other words, Game Theory is the study of mathematical models of strategic interactions among rational decision-makers.
+To achieve this goal, we need to express our game, its rules, and its actions in a mathematical form. A common model used to represent games is the following:
+
+1. States of the game: $S$, starting with $s_0$\
+   Each state describes how the game looks after an action of one of the competitors.
+
+2. Players: $P = \{1, \dots, n\}$\
+   $P$ represents the agents playing the game, where the index varies from 1 to n.
+
+3. Actions: $A$\
+   In every state, each player has a set of legal moves that change that state to another state when the player takes its turn.
+
+4. Transition function: $S \times A \rightarrow S$\
+   The transition function takes a state and an action applied to that state, and returns the state that results from that action.
+
+5. Terminal test: $S \rightarrow \{True, False\}$\
+   A boolean function that determines whether the game has reached a final state.
+
+6. Terminal utilities: $S \times P \rightarrow \mathbb{R}$\
+   Takes a state and a player, and returns a real number indicating the score (more precisely, the utility) of that player in that state.
+
+7. Policy/Strategy: $S \rightarrow A$\
+   The policy is the strategy defined by our AI program. It takes a state and returns the action that the program considers best in that situation.
+
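To make the seven components above concrete, here is a minimal sketch for a toy stick-taking game. The game itself (players alternately take 1 or 2 sticks, and whoever takes the last stick wins) and all names such as `NimState`, `transition`, and `policy` are assumptions chosen for illustration; they do not come from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NimState:
    """A state of a toy stick-taking game (hypothetical example)."""
    sticks: int    # sticks left on the table
    to_move: int   # index of the player to move: 1 or 2


PLAYERS = (1, 2)                          # P = {1, 2}
START = NimState(sticks=4, to_move=1)     # s_0


def actions(state):
    """A: legal moves in a state (take 1 or 2 sticks, if available)."""
    return [n for n in (1, 2) if n <= state.sticks]


def transition(state, action):
    """S x A -> S: remove the sticks and pass the turn to the other player."""
    return NimState(state.sticks - action, 3 - state.to_move)


def is_terminal(state):
    """S -> {True, False}: the game ends when no sticks remain."""
    return state.sticks == 0


def utility(state, player):
    """S x P -> R: +1 for the player who took the last stick, -1 otherwise."""
    winner = 3 - state.to_move   # the player who just moved
    return 1 if player == winner else -1


def policy(state):
    """S -> A: a simple hand-written policy that tries to leave a multiple of 3."""
    for a in actions(state):
        if (state.sticks - a) % 3 == 0:
            return a
    return actions(state)[0]
```

For example, `transition(START, policy(START))` gives the state after player 1's first move under this hand-written policy. In the rest of the chapter, the policy is computed by search instead of being written by hand.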
+ +### +### Different catergories of games + +
+
+Before we discuss the types of games, we first need to define the concept of a **Utility Function**. Each player prefers some states over others. For example, if player P prefers a state S, it means that P has better moves from S toward a goal state and can win the game more easily from there. So each player has a utility function that indicates, for every state, how satisfied the player is and what the best result is that it can achieve starting from that state. Naturally, in goal states, where the game has finished, the utility function is at its maximum for that player.
+
+ To learn more about game theory, see the references at the end of this document.
+
+
+

+ zero-sum-game
+ A good example on zero-sum vs. non-zero-sum games. (Credit: Market Business News)
+

+ +## +## Differnt Types of Trees for games + +
+ + + +### Single Agent Trees + +
+
+Consider the game Pacman with only one agent playing. Starting from an initial state, our agent can move either left or right. Based on this move and the moves that follow, we can form a tree called a **Single Agent Tree**. Each node of this tree represents a state of the game, and the utility function has a specific value in each of them. We call the leaves of the tree **Terminal Nodes**. In these terminal nodes the game is finished and the utility values are defined. The main question is: how can we obtain the utility value of a non-terminal node? As stated above, we already have the utility values of the leaves. So, recursively, to get the utility value of a node N, we calculate the utility value of every successor of N and choose the maximum of those numbers as the utility value of N.
+
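The recursive rule described above can be sketched in a few lines. The nested-list encoding of the tree (a number is a terminal node, a list is an inner node holding its successors) and the name `tree_value` are assumptions made only for this sketch.

```python
def tree_value(node):
    """Utility value of a node in a single-agent tree.

    A leaf is a number (its terminal utility); an inner node is a list
    of child nodes. This encoding is an assumption of the sketch.
    """
    if isinstance(node, (int, float)):   # terminal node: utility is given
        return node
    # inner node: the agent picks the successor with the highest value
    return max(tree_value(child) for child in node)


# Root with two moves (left, right); each leads to two terminal states.
tree = [[8, 2], [5, 6]]
print(tree_value(tree))  # 8
```

Because only one agent is choosing, every level takes a maximum. The adversarial case changes exactly this point.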
+ +### Adversarial Agent Trees + +
+Like the previous example, we explain this concept with the game Pacman, but this time the game has another agent called the adversarial agent or, specific to this game, the Ghost.
+As with single-agent trees, we can form a tree to express the moves of the agents, but the difference is that at each level of the tree only one agent moves. For example, if we start with player P1 at the root of the tree, its move gives us 2 successors and takes us to the first level of the tree. At this level, only the other player, P2, can move, and each of the 2 nodes will have 2 successors based on P2's move. So we have 4 nodes in the second level, and the game continues this way until the end. In this configuration, how can we calculate the utility value of the nodes? This will be our next topic.
+
+##
+## Minimax Strategy
+
+
+To obtain the value of every node in an adversarial tree, we classify the nodes into two groups:
+
+  1. The nodes at which our agent acts. The utility value of such a node is the maximum of the utility values of its successors, because we want the maximum utility for our agent. (The max side of the strategy)
+  2. The nodes at which the adversarial agent acts. The utility value of such a node is the minimum of the utility values of its successors, because the adversary tries to minimize our agent's utility. (The min side of the strategy)
+
+Now that we have every node's utility value, we are able to find the best action for each state. The main idea is that at each state, we choose the successor with the greatest minimax value and move to it.
+Why does this strategy work? Because while forming the tree and assigning the utility values, we have always considered the move that is best from the enemy's point of view and worst from ours. In other words, we assumed that the enemy always plays its best move. Now two situations are possible:
+
+  1. The adversary agent is very smart. In this case we are ready, because the strategy is based on this situation.
+  2. The adversary agent is naive and may take some suboptimal actions. In this case, we may even score more than the utility value we computed.
+
+In conclusion, the utility value that we calculate for a node in the minimax strategy is a lower bound on what we may actually score.
+
+
+
+
+### Pseudocode of Minimax Strategy
+
+
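+
+```python
+import math
+
+# Sketch: a leaf is a number (its utility U); an inner node is a list of children.
+def is_leaf(state):
+    return not isinstance(state, list)
+
+def max_value(state):
+    if is_leaf(state):
+        return state  # U(state)
+    v = -math.inf
+    for c in state:  # children(state)
+        v = max(v, min_value(c))
+    return v
+
+
+def min_value(state):
+    if is_leaf(state):
+        return state  # U(state)
+    v = math.inf
+    for c in state:  # children(state)
+        v = min(v, max_value(c))
+    return v
+```
+
+
+
+### Properties of Minimax Strategy
+
+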
+
+ + + +Complete: + + +
+
+Yes, it is complete if the tree formed for the game is finite.
+
+
+Optimal:
+
+
+
+Yes, it is optimal if the opponent also acts optimally. Against a naive opponent we still obtain at least the minimax value, but the play is not necessarily optimal, because we could have won the game sooner or with a higher score.
+
+ + +Time Complexity: + +
+
+$O(b^m)$. It depends on the branching factor $b$ and the maximum depth $m$ of the tree. So for a game like chess, with $b \approx 35$ and $m \approx 100$, we cannot use this strategy directly because the running time would be astronomically large.
+
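To get a feeling for how quickly the number of examined states grows, the sketch below counts the leaves of a uniform tree, assuming the commonly quoted rough figures of about 35 moves per position and about 100 plies for chess. The helper name `leaf_count` is hypothetical.

```python
def leaf_count(b, m):
    """Number of leaves in a uniform tree with branching factor b and depth m."""
    return b ** m


# Rough chess-like figures (assumptions): b = 35, m = 100.
digits = len(str(leaf_count(35, 100)))
print(f"35^100 has {digits} decimal digits")  # 35^100 has 155 decimal digits
```

A number with 155 digits is far beyond what any machine can enumerate, which is why the full minimax tree for chess cannot be searched.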
+ + +Space Complexity: + +
+
+$O(bm)$, because like DFS it only stores the nodes on the current path.
+
+
+
+##
+## Resource Limits
+
+
+Although the minimax algorithm is suitable for problems with a relatively small state space, it is not efficient or feasible for problems with a larger and more complicated state space, since the number of game states it has to examine is exponential in the depth of the tree.
+
+Consider a _reasonable_ chess game with $b \approx 35$ and $m \approx 100$. Because the time complexity of this algorithm is $O(b^m)$, solving the game exactly is not possible. We'll discuss some ideas to deal with this problem in detail below.
+
+
+
+### Depth-limited search
+
+
+One idea is to run the algorithm only up to a specified depth instead of searching the whole tree, which we call depth-limited search.
+
+![depth-limited](./images/depth-limited.jpg)
+
+But this technique alone is not satisfying, because it leads to another issue: _How can we find the minimax value when no terminal state is reached within the limited depth?_
+Recall how we find the minimax value of each node in the original form: we continue searching until we reach a final state and then use recursion to calculate the parents' values. But in the _limited_ version, there may be no terminal states at the depth limit.
+
+To deal with this issue, we introduce the _*Evaluation function*_.
+
+
+
+### Evaluation Functions
+
+
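Depth-limited search needs some value to return when it stops at the depth limit before reaching a terminal state, and an evaluation function supplies that estimate. The sketch below shows how the two fit together on a toy game. The game (players alternately take 1 or 2 sticks, and whoever takes the last stick wins), the state encoding `(sticks, max_to_move)`, and the hand-made heuristic in `evaluate` are all assumptions for illustration.

```python
def successors(state):
    """All states reachable in one move: take 1 or 2 sticks."""
    sticks, max_to_move = state
    return [(sticks - take, not max_to_move) for take in (1, 2) if take <= sticks]


def is_terminal(state):
    return state[0] == 0


def utility(state):
    # At a terminal state, the player who is NOT to move took the last stick.
    return -1 if state[1] else 1


def evaluate(state):
    # Heuristic estimate from MAX's point of view (an assumption of this
    # sketch): the side to move is in trouble when sticks is a multiple of 3.
    sticks, max_to_move = state
    mover_is_winning = sticks % 3 != 0
    return 0.5 if mover_is_winning == max_to_move else -0.5


def depth_limited_minimax(state, depth):
    if is_terminal(state):
        return utility(state)     # exact value at true leaves
    if depth == 0:
        return evaluate(state)    # estimate at the depth limit
    values = [depth_limited_minimax(s, depth - 1) for s in successors(state)]
    return max(values) if state[1] else min(values)


print(depth_limited_minimax((7, True), depth=2))   # 0.5
```

With a limit of 2 plies the search never reaches a terminal state from 7 sticks, so the returned 0.5 comes entirely from `evaluate`. With a large enough limit the same call returns the exact utility 1.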
+The goal of an evaluation function is to answer the question: how good is the current position? Or: how likely is it that we reach a winning terminal state from this position?
+
+Let us make this idea more concrete. An evaluation function returns an _estimate_ of the utility of a game state, just like the _heuristic_ functions discussed before, which estimate the remaining cost to a goal state.
+
+Obviously, defining an appropriate and precise evaluation function has a strong effect on the player's performance. An inaccurate evaluation function may lead to a losing position in the game.
+
+Most evaluation functions calculate the value of the position by using a _weighted sum of features_:
+
+$$Eval(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$$
+
+Each $f_i$ calculates a specific _feature_ of the state _s_. For instance, in chess, we would have features for the number of white pawns, black pawns, white queens, black queens, etc. To control the effect of each feature, we multiply it by a _weight_ $w_i$.
+
+Ideally, the evaluation function returns exactly the minimax value of the current state.
+
+
+
+
+### Iterative Deepening Search
+
+
+The accuracy of the evaluation function is most critical when the search is shallow. The deeper in the tree we search, the less the quality of the evaluation function matters.
+So a good idea is to search as deep as possible within the limits that we have. In other words, we are looking for an algorithm that can return an acceptable solution whenever we ask for one. Algorithms of this kind are called _anytime algorithms_. They are expected to find better and better solutions the longer they keep running.
+
+So instead of running depth-limited search once, we start running the algorithm with an initial depth limit (k). Then, after we have found a policy, we increase the depth limit (k' > k) and run the algorithm again with this new limit to find a better and more accurate solution. We continue iterating until the program reaches the time limit.
+This algorithm is called _iterative-deepening search_.
+
+
+##
+## Pruning
+
+
+ + + +### Houston, We have a problem! + +
+
+So far, we have seen some improvements upon our original adversarial search algorithm. But the problem with minimax search is that the number of game states it has to examine is **exponential** in the depth of the tree. The logical way to tackle this problem is to cut off some nodes and not visit them at all. This brings us to the idea of "pruning", which is used in AI algorithms whenever possible. In the following sections, we see two ways to prune our tree so that we can solve our problem in less time.
+
+
+
+### Minimax Pruning
+
+
+
+If we take a look at how minimax works, we notice that it explores some parts of the tree that are not necessary at all. In other words, exploring some branches is useless but time-consuming. If we can predict which branches are not worth exploring so that we can _prune_ them, we can improve the minimax algorithm significantly.
+
+To understand this better, consider the following example:
+
+ prunning1 +
+
+We search under the assumption that leftmost nodes are visited first. When we reach the terminal node with utility 4, _we can deduce that the value of its parent node (which is a *Min* node) is definitely less than or equal to 4_:
+
+
+ prunning2 +
+ +At this point, this information doesn't help us a lot. So let's continue searching. When we visit the terminal node with utility 3, we find that the value for the parent is also 3. An important observation here is that we can _predict_ that the value of the root _is more than or equal to 3_. That's because the root is a *Max* node. + + +
+ prunning3 +
+
+
+After that, we continue searching and discover a terminal leaf with utility 2. Just like before, we can immediately notice that the value of its parent is _less than or equal to 2_.
+
+
+ prunning4 +
+ +Now, let's ask ourselves: _Is it necessary that we explore the rightmost node?_. The answer is No. Because we already know that the value of the root is at least 3, while the maximum value of the right child of root is 2. So we can easily ignore exploring the last node and find that the minimax value of the root is 3: + +
+ prunning5 +
+
+This idea helps us find a general and systematic algorithm to predict the _bad branches_ and stop exploring them, which is called _pruning_.
+
+
+
+### Alpha-Beta Pruning
+
+
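As a bridge between the walkthrough above and the alpha-beta procedure of this section, here is a minimal sketch that replays the example tree and records which leaves are actually visited. The nested-list tree encoding, the function name `alphabeta`, and the value 9 for the rightmost leaf (which the walkthrough never needs to look at) are assumptions of this sketch.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax value of `node` with pruning; appends each visited leaf to `visited`."""
    if not isinstance(node, list):        # terminal node
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if beta <= alpha:             # the Min node above will avoid this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if beta <= alpha:                 # the Max node above already has better
            break
    return value


# Max root with two Min children: leaves (4, 3) on the left, (2, 9) on the right.
visited = []
root_value = alphabeta([[4, 3], [2, 9]], -math.inf, math.inf, True, visited)
print(root_value, visited)  # 3 [4, 3, 2]
```

The rightmost leaf never appears in `visited`: once the right Min node is known to be at most 2 while the root already has 3, the remaining leaf cannot change the result.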
+
+
+To improve our algorithm even more, we want to prune branches that cannot possibly influence the final decision. In the last section we talked about Minimax Pruning, but how can we implement it?
+
+The approach explained here, as the title says, is called Alpha-Beta Pruning.
+
+The main idea is that if the player already has a better choice (e.g. m) at the parent of a node (e.g. n), or at any choice point further up the path, then n will never be reached in actual play. So we can prune it and never check its successor nodes.
+
+![depth-limited](./images/Alpha_Beta_Intro.png)
+*As m can give the root node a higher value, n will be pruned.*
+
+
+We do this by keeping track of two parameters:
+
++ $\alpha$: The value of the best choice (i.e. highest value) we have found so far for a Maximizing Node (i.e. our player).
++ $\beta$: The value of the best choice (i.e. lowest value) we have found so far for a Minimizing Node (i.e. the opponent player).
+
+### Sample Code:
+
+Here is a Python function to help you understand the concept better.
+
+```python
+import math
+
+# game_state, GAMESTATE, terminal_utility and eval_func are assumed to be
+# provided by the game implementation.
+def alpha_beta_pruning(position, depth, alpha, beta, maximizing_player):
+    if game_state(position) == GAMESTATE.GAMEOVER:
+        return terminal_utility(position)
+    if depth == 0:
+        return eval_func(position)  # estimate at the depth limit
+
+    if maximizing_player:  # for maximizing player (our player)
+        max_eval = -math.inf
+        for child in position.children:
+            value = alpha_beta_pruning(child, depth - 1, alpha, beta, False)
+            max_eval = max(max_eval, value)
+            alpha = max(alpha, value)
+            if beta <= alpha:
+                break  # the minimizer above will never allow this branch
+        return max_eval
+    else:  # for minimizing player (opponent player)
+        min_eval = math.inf
+        for child in position.children:
+            value = alpha_beta_pruning(child, depth - 1, alpha, beta, True)
+            min_eval = min(min_eval, value)
+            beta = min(beta, value)
+            if beta <= alpha:
+                break  # the maximizer above already has a better option
+        return min_eval
+```
+
+
+So to initialize this function, we first need to give alpha the worst value for our player, which is $-\infty$, and give beta the worst value for the opponent, which is $+\infty$.
+
+```python
+# initialization (the depth limit of 3 is only an example value)
+alpha_beta_pruning(current_position, 3, -math.inf, math.inf, True)
+```
+
+Now let's look at an example while running this code.
+
+

+ alpha_beta_run_1
+ We approach the first leaf and since the parent is a Minimizing Node, its beta value gets updated
+ alpha_beta_run_2
+ The second leaf has a larger value. Since beta is only ever updated to a lower value, nothing changes here.
+ alpha_beta_run_3
+ The algorithm goes back to the root's second child and then goes to the third leaf. Keep in mind that the parent's alpha and beta values always get passed to its children.
+ alpha_beta_run_4
+ Now since beta is less than alpha, we know that our root's best possible value from this child will be a 3 which is lower than its other child (i.e. 5), so we prune the other leaf.
+

+ + + + +## +## Expectimax Search + +
+
+
+In the real world, not all games are decided only by the players' skills. Many games contain stochastic elements, like throwing a die or picking a random card. These games are called **stochastic games**.
+
+Even when there is no element of luck, we can see unpredictable behavior from our agent or the opponent's agent, like a robot slipping on a wet floor or the opponent taking a worse route.
+
+So the problem we face here is that while we know which legal actions we can take, we cannot know in advance which moves the opponent will be able to take, because they depend on the random element (i.e. the outcome of the transition function is not fixed as the game goes on).
+
+Let's introduce a new kind of node called **chance nodes**. We represent these nodes as circles in the game tree.
+
+Let's look at this with an example from the epic, legendary game, Snakes and Ladders.
+
+

+ snakes_ladders
+ This game is played by two or more players. Each player throws a die on its turn and moves as many squares as the number rolled.
+

+
+So in this game there is an element of chance. If the opponent rolls a 1, it moves one square; if it rolls a 2, it moves two squares, and so on. The main point is that the action the opponent can take depends on a random number from the die, and each outcome has probability $\frac{1}{6}$.
+
+So the new game tree will be something like this:
+
+

+ expectimax_tree
+ The minimal expectimax tree for snakes ladder game.
+

+ + + +Node values in Expectimax Search: + +
+
+Since this kind of search deals with probability, we calculate node values using expected values.
+
+Terminal nodes and MAX and MIN nodes (for which the dice roll is known) work exactly the same way as before. So how should we calculate the value of chance nodes?
+
+For every chance node, the value is the expected value of its children.
+For example, the value of every chance node in Snakes and Ladders will be:
+
+$$value(s) = \sum_{i=1}^{6} \frac{1}{6} \, value(child_i)$$
+
+So the general formula for values will be as follows:
+
+| value | condition |
+|------|--------|
+| UTILITY(s) | s = terminal node |
+| $\max_a$ EXPECTIMAX(RESULT(s, a)) | PLAYER(s) = MAX |
+| $\min_a$ EXPECTIMAX(RESULT(s, a)) | PLAYER(s) = MIN |
+| $\sum_r P(r)$ EXPECTIMAX(RESULT(s, r)) | PLAYER(s) = CHANCE |
+
+##
+## Other types of games
+
+
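Games that combine max, min, and chance layers, such as the dice games this section turns to, can be evaluated with the same node-value rules used for Expectimax Search. The following is a minimal sketch of that recursion. The tuple-based tree encoding and the function name `expectiminimax` are assumptions for illustration, not part of the lecture.

```python
def expectiminimax(node):
    """Value of a node in a tree with MAX, MIN and CHANCE layers.

    Node encoding (an assumption of this sketch):
      - a number: terminal utility
      - ("max", [children]) or ("min", [children])
      - ("chance", [(probability, child), ...])
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        # expected value: probability-weighted sum over the outcomes
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")


# MAX chooses between two moves; each is followed by a fair coin flip.
tree = ("max", [
    ("chance", [(0.5, 10), (0.5, 0)]),                # expected value 5.0
    ("chance", [(0.5, ("min", [4, 8])), (0.5, 8)]),   # 0.5*4 + 0.5*8 = 6.0
])
print(expectiminimax(tree))  # 6.0
```

Note that the first move can reach the highest single payoff (10), yet the second move is preferred because its expected value is higher.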
+On top of what has been discussed before, there is another type of game called Mixed Layer Types. In this type, in addition to the players' actions, a random factor affects the game, so we must consider this factor in our computations. For example, Backgammon is a mixed-layer game, because rolling the dice affects the states and the result of the game.
+
+To do some calculations about the size of the game tree, we can state:
+
+b = 1 because 21 distinct states are created as an outcome of rolling 2 dice.
+
+
+
+So if we want to build the tree to a depth of 2, we will have leaves and terminal states. As we go deeper in the tree, this number increases.
+
+As the depth increases, the probability of reaching a given state gets smaller and limiting the depth becomes less damaging. So the usefulness of deep search diminishes and pruning becomes trickier.
+
+TD-Gammon is a computer backgammon program developed in 1992. It uses a depth-2 search, a very good evaluation function, and reinforcement learning, and the result is a champion-class player! TD-Gammon was the first AI world champion in any game!
+
+
+
+### Multi-Agent Utilities
+
+
+In games with multiple players, terminal states have utilities in the form of tuples, and each player tries to maximize its own component. This can give rise to cooperation and competition dynamically.
+
+##
+## Conclusion
+
+
+
+Before this, we only discussed games with a single player. We use adversarial search when there are other players and agents in the game too, often with opposing objectives. With adversarial search, we are seeking a **Playing Strategy** that gives us correct directions and instructions at each state of the game.
+
+We classified games in two main groups based on utility functions:
+- Zero-Sum Games: The sum of agents' utility functions is equal to zero. (pure competition)
+- General Games: independent utility functions. (cooperation may be beneficial)
+
+We formed trees for both single-agent and adversarial games and we found a strategy to obtain the value of nodes in both situations.
+
+The optimal strategy against an opponent playing perfectly can be found using the **Minimax strategy**. It also gives a lower bound on the score that can be achieved in the game.
+A complete search is often impossible in minimax because of resource limits. So there are two options:
+- Replace the utility function with an **Evaluation Function** that estimates utility. This function must be defined very precisely, because the whole solution depends on it.
+- Do **Minimax Pruning**: while calculating minimax values, it is sometimes possible to detect and **prune** suboptimal branches that we know will never be chosen.
+
+Another point to consider is that while looking for a good playing strategy, we must act neither optimistically nor pessimistically, because both can result in suboptimal behaviour. It is important to evaluate each problem realistically and choose our strategy in proportion to the adversarial agent's cleverness.
+
+Uncertain factors (like rolling dice) are represented by chance nodes in a game tree. The optimal strategy in such games can be found using **Expectimax Search**, which was explained before.
+ +## +## Useful links + + ++ [Step-by-step alpha-beta example](https://www.youtube.com/watch?v=xBXHtz4Gbdo) ++ [A simple overview of adversarial search with some examples](https://www.baeldung.com/cs/expectimax-search) ++ [More about adversarial search in chess](https://medium.com/@SereneBiologist/the-anatomy-of-a-chess-ai-2087d0d565) ++ [About minimax time complexity](https://stackoverflow.com/questions/2080050/how-do-we-determine-the-time-and-space-complexity-of-minmax) + + + + +## +## References + +
+ + ++ https://towardsdatascience.com/understanding-the-minimax-algorithm-726582e4f2c6 ++ Russell, S. J., Norvig, P., & Davis, E. (3rd. Ed). Artificial Intelligence: A modern approach. Pearson Educación. ++ https://slideplayer.com/slide/4380560/ ++ https://www.cs.cornell.edu/courses/cs312/2002sp/lectures/rec21.htm ++ https://www.techslang.com/definition/what-is-adversarial-search/ ++ https://en.wikipedia.org/wiki/Game_theory ++ https://brilliant.org/wiki/minimax/#:~:text=In%20game%20theory%2C%20minimax%20is,payoff%20as%20large%20as%20possible. ++ https://en.wikipedia.org/wiki/Minimax ++ Sebastian Lague's YouTube channel diff --git a/notebooks/adversarial_search_fall_2021/matin_daghyani.jpeg b/notebooks/adversarial_search_fall_2021/matin_daghyani.jpeg new file mode 100644 index 00000000..a307675d Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/matin_daghyani.jpeg differ diff --git a/notebooks/adversarial_search_fall_2021/metadata.yml b/notebooks/adversarial_search_fall_2021/metadata.yml new file mode 100644 index 00000000..0a8509a8 --- /dev/null +++ b/notebooks/adversarial_search_fall_2021/metadata.yml @@ -0,0 +1,44 @@ +title: Adversarial Search # shown on browser tab + +header: + title: Adversarial Search # title of your notebook + description: An intro to adversarial search # short description of your notebook + +authors: + label: + position: top + content: + - name: Mohammad Abolnejadian + role: Author + image: mohammad_abolnejadian.jpeg + contact: + - link: https://github.com/mothegoat + icon: fab fa-github + - link: mailto:mohammad.abolnejadian@gmail.com + icon: fas fa-envelope + + - name: Mohammadali MohammadKhani + role: Author + image: mohammadali_mohammadkhani.jpeg + contact: + - link: https://github.com/MoaliMkh + icon: fab fa-github + - link: mailto:mohammadali1379@gmail.com + icon: fas fa-envelope + # optionally add other contact information like + # - link: # contact link + # icon: # Font Awesome tag for link (check: 
https://fontawesome.com/v5.15/icons) + + - name: Matin Daghyani + role: Author + image: matin_daghyani.jpeg + contact: + - link: https://github.com/mtndaghyani + icon: fab fa-github + - link: mailto:matin.daghyani@gmail.com + icon: fas fa-envelope + +# comments: +# # enable comments for your post +# label: false +# kind: comments \ No newline at end of file diff --git a/notebooks/adversarial_search_fall_2021/mohammad_abolnejadian.jpeg b/notebooks/adversarial_search_fall_2021/mohammad_abolnejadian.jpeg new file mode 100644 index 00000000..dce86a87 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/mohammad_abolnejadian.jpeg differ diff --git a/notebooks/adversarial_search_fall_2021/mohammadali_mohammadkhani.jpeg b/notebooks/adversarial_search_fall_2021/mohammadali_mohammadkhani.jpeg new file mode 100644 index 00000000..1627d0c9 Binary files /dev/null and b/notebooks/adversarial_search_fall_2021/mohammadali_mohammadkhani.jpeg differ diff --git a/notebooks/index.yml b/notebooks/index.yml index 1b2133c0..4dfca602 100644 --- a/notebooks/index.yml +++ b/notebooks/index.yml @@ -18,6 +18,8 @@ notebooks: kind: S2021, LN, PDF - notebook: notebooks/8_adversarial_search/ kind: S2021, LN, Notebook + - notebook: notebooks/adversarial_search_fall_2021/ + kind: F2021, LN, Notebook - notebook: notebooks/9_bayesian_networks/index1.ipynb kind: S2021, LN, Notebook metadata: notebooks/9_bayesian_networks/metadata.yml