Problem-solving searches require trial and error in that they generally do not go directly to the solution without traversing and retracing some blind alleys—sometimes many, sometimes few. When a person solves a problem without any backtracking whatsoever, we are apt to deny that he needed to think at all. We say, “He knew the answer,” or “he didn’t have to think; he did it by rote.” — /Thinking by computers/, Herbert A. Simon
Why search for the answer if you can just know it instead?
No matter how an algorithm is coded, it can be reduced to a series of if-then statements in a loop. The if-then statements can be represented in a lookup table, where each row in the table is a different condition. Thus, even a complex search algorithm can be reduced to a lookup table, assuming one has all the requisite data to produce such a table.
For example, we could devise a search algorithm that plays good tic-tac-toe. Or, we can produce a table (with less than 1024 entries) that describes which move to take depending on the current board configuration.
Why use a search if we can make a lookup table instead? There are three reasons: two practical, one intellectual.
A lookup table may be too large when fully expanded. A typical chess game lasts about 40 moves. To represent all possible moves and counter-moves would require a lookup table with 10120 entries. This is, naturally, impossible without some extremely clever encoding scheme, since the universe is estimated to contain less than 1081 atoms.
Using a search algorithm that generates possible moves when requested avoids this impossible space requirement. This way, the lookup table is never available in its entirety. Instead, each possible move and corresponding action are generated as-needed.
The natural world, unlike chess, does not seem to obey a simple set of rules. Imagine a robot placed in a random building; the robot has no way of mapping this building before it arrives there, so it cannot make a lookup table of all the different movements it can make through the building.
Also consider a web search engine; there is simply no way to predict the content and links that webpages will contain as people publish more and more material. Likewise, there exists an infinite variety of search queries. Thus, no lookup table relating all possible queries to ever-changing web pages can be generated.
A lookup table is impossible, yet again. Search is required.
Lastly, a lookup table tells us nothing about how a problem is solved. A table that contains all the moves of tic-tac-toe or checkers or chess gives no insight about how a good game may be played. There may be reasons certain moves are better than others but the lookup table does not offer those insights.
A search algorithm, on the other hand, describes intelligent behavior, assuming the intelligent thing to do is to search. For example, we may read the Google Search source code and see that a page is ranked more highly if other highly-ranked pages link to it. The algorithm may not be appropriate (although Google Search is quite successful, so it probably is appropriate), but nevertheless the code describes a certain process that we, as programmers, may find to be meaningful and possibly intelligent.
A problem may be solvable by a search technique if the problem has each of the following features:
- Initial state
- some description of the agent’s starting situation
- Possible actions
- the set of actions (such as chess moves) available to the agent, also called “applicable” actions; the possible actions depend on the state
- Transition model
- some way of figuring out what an action does;
in other words, a
resultOf(state, action)
function which returns a state; the transition model defines a state space, which takes the form of a directed graph (vertices are states, edges are actions) - Goal criteria
- some way of finding out if the agent has reached
its search goal; the function
goal?(state)
is a boolean function (i.e., a predicate) - Path cost
- a function that calculates the cost of a path (a sequence of actions); for example, a driving agent may calculate a path cost by adding the number of driving minutes between each city in its route
Of course, each of these components assumes a liberal abstraction away from most of the details that the agent will be facing in reality. For example, a driving agent may have a transition model that shows that driving north on some street results in arriving at the center of town. Of course, this is a ceteris paribus assumption (“everything else being equal”); if something bad happens, like a flat tire, this transition model is useless.
In computer science, we typically compare algorithms based on two criteria: “time complexity” and “space complexity.”
The time complexity of an algorithm is generally the number of operations required (e.g., basic CPU operations such as addition, subtraction, tests for equality, memory reads and writes, etc.).
For example, suppose we have a min
algorithm that finds the smallest
value in any list. The algorithm checks every element in the list, so
its “time complexity” is
On the other hand, inefficient sorting algorithms have
Normally, we can make an algorithm faster by using more memory. We
could, for example, save the answers of some computations to avoid
computing them again. The min
function has
For comparing search algorithms in particular, we’ll often want to talk about the following values:
-
$n$ , the total number of possible states -
$b$ , the branching factor; this is the maximum number of possible actions or branches we might expect from any state in the search space. -
$d$ , the depth in the search tree of the least-cost path, i.e., the minimum number of steps required to reach the goal -
$m$ , the maximum search depth, if we choose to make it a finite number (so that the algorithm might stop looking deeper and instead choose to backtrack to an earlier state), or$∞$ otherwise