And yet players typically develop some basic rules of thumb; they don't go calculating huge tables of permutations. What scared devs, I think, is how the options multiply. You don't have to go through all the options; the key is tree pruning.

Even a chess engine might only search 5-10 moves deep along each likely branch. If you add human-style rules of thumb to a pruned search, you get through fast without a lot of number crunching.
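To make the pruning idea concrete, here's a minimal sketch of alpha-beta pruning over a hypothetical toy game tree (leaves are static evaluations, made up for illustration); whole subtrees get skipped once they can't affect the result:

```python
# Alpha-beta pruning on a toy game tree: inner nodes are lists of
# children, leaves are integer evaluations (illustrative values only).
def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, int):              # leaf: static evaluation
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:              # prune: opponent avoids this branch
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:              # prune: we avoid this branch
                break
        return value

# Classic example: once the first subtree guarantees 3, the 9 and the 1
# in the other subtrees are never even evaluated.
tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree, float("-inf"), float("inf"), True))  # → 3
```

The rules-of-thumb analogy is move ordering: the better you guess which branch to try first, the earlier the cutoffs fire.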

Interesting article on the state of AI for Go: http://www.newyorker.com/online/blogs/elements/2014/03/the-electronic-holy-war.html

Even tree pruning takes number crunching, and the less clear-cut the measurement of success is, the harder it gets.

Not a bad article, but contains a huge lie:

To say that Go is more complex than chess, though, is a little like saying that one infinity is larger than another. While technically true—and mathematically possible—it does not fully explain why computers, which can’t fully compute chess or Go, have become good at one and not the other. “A few hundred orders of magnitude don’t matter when you’re up in the ten to the one hundred and twenty,” Murray Campbell, a member of the I.B.M. Deep Blue team, told me. The key lies in Go’s structure. Deep Blue was able to exploit a weakness in chess’s armor: at the grandmaster level, to tell who is winning, you add up the pieces on the board and consider their positions. Campbell explained that, to win, you just stay ahead the whole time, vastly reducing the number of moves for computers to consider.

The truth is that present-day computer chess uses two kinds of 'heuristics' to evaluate moves: positional heuristics (basically, an algorithm that assigns a numeric value to a given game configuration) and, more importantly, a cleverly crafted 'pattern database' that maps game states into a vast - but very compactly represented - database of chess endgames (think of all the chess problems you find in the newspaper, compiled together). What positional heuristics (or rather, ensembles of such heuristics) do is speed up the search through the midgame. The reason is that, as fewer pieces remain on the board, the values of such positional heuristics become less informed (which means it is harder to tell good states from bad ones). The solution was enabled by 'pattern database heuristics', which basically rate states according to how similar they are to game states leading into a known endgame (that is, a sequence of moves where the player is guaranteed victory).
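As an illustration, the crudest possible positional heuristic is a plain material count - real engines layer many positional terms (mobility, king safety, pawn structure) on top of something like this hypothetical sketch:

```python
# Toy 'positional heuristic': sum of standard piece values, positive
# when White is ahead. Uppercase = White, lowercase = Black; kings
# score 0 since both sides always have one. Illustrative only.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_eval(pieces):
    score = 0
    for piece in pieces:
        value = PIECE_VALUES.get(piece.upper(), 0)
        score += value if piece.isupper() else -value
    return score

# White: king, queen, pawn (10); Black: king, rook, pawn (6)
print(material_eval(["K", "Q", "P", "k", "r", "p"]))  # → 4
```

This is also why such heuristics degrade late in the game: with few pieces left, most positions score near zero and the number stops discriminating between won and lost states.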

This strategy doesn't work for Go, for several reasons. First, the game tree is exponentially bigger than that of chess, so weaknesses in positional heuristics (all heuristics, by definition, have a 'blind spot') early in the search can lead you astray very easily. Second, the Go "end game" isn't as well documented as that of chess, so nobody has come up with good pattern database heuristics (and there are well-founded doubts about the technical feasibility of building such pattern databases in the first place).

In order to succeed at Go, Go programs had to abandon the algorithmic framework used in chess ever since Claude Shannon wrote his seminal paper on the subject (presenting the Min-Max algorithm). The problem with Min-Max (and its evolutions like Alpha-Beta Min-Max or NegaScout) is that it is **too** systematic, even with pruning rules. That systematicity - which guarantees optimal play given enough time to compute the moves - had to be abandoned. And here we arrive at the Monte Carlo approaches, which are derived from methods for solving the "n-armed bandit problem" (that is, how to figure out the strategy that maximizes profit when choosing among several slot machines).
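The bandit idea can be sketched with UCB1, one standard solution to that problem (the payout rates and constants below are made up for illustration):

```python
import math
import random

# UCB1 for the n-armed bandit: pick the arm with the best
# "average payoff + exploration bonus"; the bonus shrinks as an
# arm gets sampled more, balancing exploitation and exploration.
def ucb1(payoff_sums, counts, total_pulls):
    best_arm, best_score = 0, float("-inf")
    for arm, (s, n) in enumerate(zip(payoff_sums, counts)):
        if n == 0:
            return arm                     # try every arm at least once
        score = s / n + math.sqrt(2 * math.log(total_pulls) / n)
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm

random.seed(0)
true_means = [0.2, 0.5, 0.8]               # hidden slot-machine payout rates
sums, counts = [0.0] * 3, [0] * 3
for t in range(1, 2001):
    arm = ucb1(sums, counts, t)
    sums[arm] += 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1

print(counts.index(max(counts)))            # the best arm ends up dominating
```

Replace "arm" with "move" and "payout" with "win rate of random continuations" and you have the core of Monte Carlo tree search.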

These methods are based on 'rolling out' a base strategy (which, in the case of a board game, encodes the kind of positional or lookahead reasoning of chess heuristics) and adjusting this base strategy by **sampling** possible executions (i.e. the program applies the strategy until reaching a terminal - win or loss - state) and updating the expected payoffs accordingly. The 'magic' of these methods lies in being able to account for the variance in payoffs, so the program prioritizes those strategies which 1) hold the promise of better payoffs and 2) haven't been explored as often. The level of play of these programs is proportional to the number of strategy evaluations they can make per unit of time.
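Stripped to its essentials, the rollout-and-average scheme looks like this flat Monte Carlo player for a toy game of Nim (the game and sample counts are my own illustration; it uses a uniformly random base strategy and omits the variance/UCB refinement described above):

```python
import random

# Toy Nim: take 1-3 stones per turn; whoever takes the last stone wins.
def random_rollout(stones, my_turn):
    # Play uniformly random moves to a terminal state; return the payoff
    # (1.0 if "we" took the last stone, else 0.0).
    while stones > 0:
        stones -= random.randint(1, min(3, stones))
        my_turn = not my_turn
    return 0.0 if my_turn else 1.0

def monte_carlo_move(stones, n_samples=2000):
    # For each legal move, sample many rollouts from the resulting
    # position and keep the move with the best average payoff.
    best_move, best_avg = None, -1.0
    for take in range(1, min(3, stones) + 1):
        wins = sum(random_rollout(stones - take, my_turn=False)
                   for _ in range(n_samples))
        if wins / n_samples > best_avg:
            best_move, best_avg = take, wins / n_samples
    return best_move

random.seed(1)
# From 7 stones, optimal play takes 3, leaving the opponent a multiple of 4.
print(monte_carlo_move(7))  # → 3
```

Note that no Nim-specific knowledge is encoded anywhere: the statistics of random play alone recover the right move, which is exactly the property that made this family of methods viable for Go.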

The fact that these programs play differently from how humans typically do raises an interesting 'meta-physical' question. Games - all games - can be formalized into a mathematical model (which can be more or less 'workable'). We humans play these games, but usually only get a glimpse of all the possible developments of the game. We are *taught* to play these games in very particular ways: in chess, strategies are to a great degree a cultural construct built around the game. The Monte Carlo Go-playing program doesn't carry this cultural baggage (which is useful for playing the game effectively) and explores ways of playing that have never crossed our minds. It's a bit like having these programs look for your birthday in the decimal digits of pi - if pi is normal, as widely conjectured, you're guaranteed to find it, infinitely often. But it is also true that you can find patterns there that nobody has thought of yet (or assigned any meaning to).