Awaken, foul beast! (Game AI)

rwould · Post by **rwould** » Tue Jun 10, 2003 3:28 pm

The difficulty is that you need to make a fresh assessment for each move, but each assessment has to look at which likely future moves will benefit from the move being made. In addition you need to assess the position of players on subsequent turns.

I've considered attempting to sort out some form of AI for this, but it is a very tricky proposition. To fully understand how complex it is try playing a game against yourself, and writing down every single consideration you make before making a move. And then add all the silly things that you automatically discarded.

Because of the increased variables caused by repeated moves and different player statistics it is far more complicated to do a simple AI process for BB than for a number of other games, in particular if you wish to generate a self-learning system.

Hope it makes sense!

Richard

Jerhod · Post by **Jerhod** » Tue Jun 10, 2003 3:41 pm

bitMonkey wrote:Jerhod: Asymptotically, in theory, overtraining should not be a problem. Theory is all good and well though... maybe there is a problem here, I am not sure. The state space is huge, after all. However, it would not have to stop learning after having trained by playing against itself. It could store away a game played against a human, or several humans, and study them later. So the next version could be better than the previous.

I'm not saying that it would stop learning after playing games against itself. However, it still can take a really long time to retrain the computer.

I'm not sure what AI the backgammon game runs on, but I've built a neural net myself and had other experiences working with and studying AIs. The idea behind machine learning on a neural net is that the factors that have the most influence are updated the most when the machine learns something. For example, if factor #1 in the AI contributed twice as much as factor #2 did to making a decision that turned out to be good, then factor #1 will be reinforced twice as much.
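One way to read the "reinforced twice as much" idea is the delta rule on a single linear unit. This is only a sketch under that assumption, not a claim about how the backgammon program works; `delta_rule_step`, the learning rate and the input values are all made up for illustration.

```python
def delta_rule_step(weights, inputs, target, lr=0.1):
    """One delta-rule update for a single linear unit (illustrative only)."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = target - prediction
    # Each weight moves in proportion to how strongly its input was active.
    updates = [lr * error * x for x in inputs]
    new_weights = [w + u for w, u in zip(weights, updates)]
    return new_weights, updates


weights = [0.0, 0.0]
inputs = [2.0, 1.0]  # assumption: factor #1 was twice as active as factor #2
new_weights, updates = delta_rule_step(weights, inputs, target=1.0)
print(updates)  # the first update is twice the second
```

In this particular rule the size of an update depends on the input and the error, not on how large the weight already is.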

Consider this: if you've played 400,000 games and have determined that strategy X is much, much better than strategy Y, what would you do after losing 100 games in a row using strategy X when strategy Y would have let you win? You probably wouldn't change at all and would still use strategy X because strategy X is, on average relative to your data set, still a much better strategy to use.

It's the same idea with machine learning. If a computer plays 400,000 games against itself and decides that factor #1 is much, much better than factor #2 when deciding what move to make, then you'll have to have the computer play at least thousands of games against humans before it gets the idea that maybe factor #2 could be significant. The reason for this is that factor #2 will have such a low level of significance after the 400,000 games that it will hardly be updated at all after any one game because it had such a small effect on the game being played.

Here's an incredibly funny real world example of this. The army (I believe the US army, but I'm not 100% positive) paid for a research group to develop a machine that could take a snapshot of a landscape and determine whether or not there was a tank hiding somewhere in the landscape. The research group developed the hardware (a big, high quality camera) and the software involved to analyze the pictures. The research team took pictures of landscapes without tanks and had the army drive tanks into the landscape, camouflage them, and then retake the pictures. They trained the machine on these pictures and the machine got pretty darn good at pointing out hidden tanks. Fabulous, right?

When tested in the field, the machine was horrible and very inaccurate. What had happened? Well, all of the pictures with tanks in them that the machine had trained on were taken in the day time because that's when the research group could find people to drive tanks around. All of the pictures without tanks were taken in the evening after the tanks had been returned to the army base. All the machine had learned to do was tell day time from night time! The research group showed another set of photographs to the machine to try and retrain it, this time using a variety of combinations so that a similar problem wouldn't occur, but the machine wouldn't change significantly. The machine was so "convinced" that only day vs. night mattered that it would not learn anything new. The research group had to reset the machine and begin training it over from the beginning.

DoubleSkulls · Post by **DoubleSkulls** » Tue Jun 10, 2003 4:03 pm

Jerhod wrote:Here's an incredibly funny real world example of this. The army (I believe the US army, but I'm not 100% positive) ...

IIRC it was the British Army & Imperial College did the work.

Jerhod · Post by **Jerhod** » Tue Jun 10, 2003 4:07 pm

ianwilliams wrote:IIRC it was the British Army & Imperial College did the work.

Thanks!

grep-v · Post by **grep-v** » Tue Jun 10, 2003 4:30 pm

ianwilliams wrote:
grep-v wrote:By looking at a position you can tell if a TD is possible, but looking at two positions in general you cannot tell which one is better.
It's pretty hard but you ought to be able to score a given board position e.g. how close am I to scoring, how well protected is the ball/ball carrier etc, or not if you are on D.

Ok, how would this be done?
(No definitive solution expected, just a guideline as to which directions are right.)

grep-v · Post by **grep-v** » Tue Jun 10, 2003 4:47 pm

Jerhod wrote: I'm not sure what AI the backgammon game runs on, but I've built a neural net myself and had other experiences working with and studying AIs. The idea behind machine learning on a neural net is that the factors that have the most influence are updated the most when the machine learns something. For example, if factor #1 in the AI contributed twice as much as factor #2 did to making a decision that turned out to be good, then factor #1 will be reinforced twice as much.

I suppose you used backpropagation (from your description) which is not the only learning technique. Still a problem: neural nets are classification mechanisms. You have an input vector and want to know which class it does belong to. The (3-layer) neural net can classify any n-dimensional space (i.e. input vector of arbitrary length), but you have to train it before! In training you must have clearly classified test vectors and teach the net how to classify them by backpropagation. Other methods include genetic algorithms, Monte Carlo, the whole other optimization bunch. The problem with BB is the following: classification of positions is nearly impossible, at least for the number of test problems you would need to train a net that can juggle with all the BB-parameters. Sooooo, no easy way in using neural nets for more than subproblems.

Jerhod wrote: If a computer plays 400,000 games against itself and decides that factor #1 is much, much better than factor #2 when deciding what move to make, then you'll have to have the computer play at least thousands of games against humans before it gets the idea that maybe factor #2 could be significant. The reason for this is that factor #2 will have such a low level of significance after the 400,000 games that it will hardly be updated at all after any one game because it had such a small effect on the game being played.

A bit simplified and also wrong. It solely depends on your learning method. Try out some Hopfield nets. They can classify two different 2D-patterns. As soon as you try to train them to recognize a third pattern they slowly "forget" one or both of the previous patterns. The rate of decay depends on your learning function.
Neural nets are a powerful method of classification by cascaded linear discrimination (science-babble warning). But that's all they are, no wonderful miracle solution anywhere. AND there are equivalent and different powerful tools for similar problems.
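For anyone who wants to try the Hopfield suggestion, here is a minimal sketch that stores two patterns and recalls one from a corrupted copy. It assumes the plain Hebbian outer-product rule and synchronous updates; the function names and the two 8-element patterns are made up for illustration. The gradual forgetting when a third pattern is trained in depends on the learning function used and is not shown here.

```python
def train_hebbian(patterns):
    """Build a Hopfield weight matrix with the Hebbian outer-product rule."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, max_steps=5):
    """Update all units synchronously until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_steps):
        new_state = [
            1 if sum(w[i][j] * state[j] for j in range(n)) >= 0 else -1
            for i in range(n)
        ]
        if new_state == state:
            break
        state = new_state
    return state


p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hebbian([p1, p2])

noisy = [-1] + p1[1:]  # p1 with its first element flipped
restored = recall(w, noisy)
print(restored == p1)
```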

Jerhod · Post by **Jerhod** » Tue Jun 10, 2003 4:49 pm

grep-v wrote:
ianwilliams wrote:
grep-v wrote:By looking at a position you can tell if a TD is possible, but looking at two positions in general you cannot tell which one is better.
It's pretty hard but you ought to be able to score a given board position e.g. how close am I to scoring, how well protected is the ball/ball carrier etc, or not if you are on D.
Ok, how would this be done?
(No definitive solution expected, just a guideline as to which directions are right.)

I think that's the problem - there is not a good guideline for how to do this. And I'm not sure that there ever will be a good one that's feasible for your average computer.

The best way I can think of to do this would be to do a search along the lines of:

1) Is a TD possible?
2) If so, in how many ways is a TD possible?
3) Of these ways, what is the method that is statistically most likely to result in a successful TD attempt?
4) If you were not to make a TD attempt, could you position yourself to make a TD attempt next turn more likely than it is this turn?
5) What would be the chance of being able to establish that position?
6) Would the joint probability of setting up the position with the chances of making a TD from that position be greater than the chance of making a TD attempt this turn?

And then you'd have to continue for "can you position yourself so that in two turns you'll have a possible TD attempt that is more likely than a TD attempt next turn or this turn?" and so on...

I think that a computer would be able to get up to #3, but I doubt that a computer could get through #4.
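The comparison in step 6 is just a product of probabilities set against the chance of scoring now. A minimal sketch, assuming the three probabilities are already known (which is the hard part); the function name and the numbers are made up, and it ignores whatever the opponent does in between.

```python
def compare_now_vs_wait(p_score_now, p_setup, p_score_after_setup):
    """Return (should_wait, p_wait) for one turn of lookahead."""
    # Joint probability: the setup must work AND the later attempt must succeed.
    p_wait = p_setup * p_score_after_setup
    return p_wait > p_score_now, p_wait


# Made-up numbers: a 50% attempt now, or a 90% setup for a 75% attempt next turn.
should_wait, p_wait = compare_now_vs_wait(0.5, 0.9, 0.75)
print(should_wait, round(p_wait, 3))
```

Extending this to "two turns from now" means multiplying in another setup probability for each extra turn, which is where the number of positions to consider blows up.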

Skummy · Post by **Skummy** » Tue Jun 10, 2003 4:50 pm

grep-v wrote:Neural nets are a powerful method of classification by cascaded linear discrimination (science-babble warning).

You had me at backpropagation, grep-v. You had me at backpropagation.

tonythep · Post by **tonythep** » Tue Jun 10, 2003 4:52 pm

grep-v wrote:
ianwilliams wrote:
grep-v wrote:By looking at a position you can tell if a TD is possible, but looking at two positions in general you cannot tell which one is better.
It's pretty hard but you ought to be able to score a given board position e.g. how close am I to scoring, how well protected is the ball/ball carrier etc, or not if you are on D.
Ok, how would this be done?
(No definitive solution expected, just a guideline as to which directions are right 8) )

If I had to guess, it'd look something like this:

For the team with the ball:

RUN_PLAY:
Is there a player within (MA+2) spaces of the goal?
If so, is he also within (carrier's MA+2) spaces of the carrier?
If so, we have a developing situation; the offense can score on this turn with a handoff at worst.

Now we calculate probabilities based on the nominated scorer's AG: dodge rolls, catching the handoff, etc. We come back a few thousand trials later to find out where the percentage falls below the not-worth-it barrier (which likely moves, depending on the game situation).

Once we've done that, we go to the next series of calculations:

PASS_PLAY:
Is there a player within (MA+2) spaces of the goal?
If so, how close can the ball carrier get to him in (MA+2) moves? We'd want to weight this so that the ball carrier prefers to throw unmolested -- no point in getting one square closer if that takes you into TZs.

And then start calculating odds based on proximity, thrower's AG, receiver's AG.

This information is all neutral; we're not yet asking whether we want the offense to score (because we have the ball) or we want to stop them (because they do). It's just a big grinding of percentages. Once we have the percentages, we can start working on improving them: i.e. hitting or obstructing inconvenient opposing players.

All of this is hugely simplified, of course, and is likely to have at least one outright error in it because I'm pre-coffee. But you asked.
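The RUN_PLAY check and the "grinding of percentages" could be sketched roughly as below. The assumptions are mine: positions are (x, y) squares, a diagonal step counts as one square, the end zone is a column at `endzone_x`, tackle zones and blockers are ignored, and the roll targets in the example are invented. The sketch also swaps the "few thousand trials" for an exact product of d6 odds.

```python
from fractions import Fraction

GFI = 2  # assumed: up to two extra "go for it" squares on top of MA


def squares_between(a, b):
    """Distance in squares, counting a diagonal step as one square."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def run_play_possible(carrier, carrier_ma, scorer, scorer_ma, endzone_x):
    """The two RUN_PLAY questions, ignoring tackle zones and blockers."""
    scorer_reaches_goal = abs(endzone_x - scorer[0]) <= scorer_ma + GFI
    carrier_reaches_scorer = squares_between(carrier, scorer) <= carrier_ma + GFI
    return scorer_reaches_goal and carrier_reaches_scorer


def chain_probability(targets):
    """Chance that every d6 roll in a sequence succeeds (a target of 3 means 3+)."""
    p = Fraction(1)
    for target in targets:
        p *= Fraction(7 - target, 6)
    return p


# Made-up example: a dodge on 3+, a handoff catch on 3+, one go-for-it on 2+.
odds = chain_probability([3, 3, 2])
print(odds, float(odds))
```

With those made-up targets the chain comes out at 10/27, a bit over a third, which is the kind of figure that would then be compared against the not-worth-it barrier.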

Jerhod · Post by **Jerhod** » Tue Jun 10, 2003 4:53 pm

I think it's clear that my AI knowledge is very limited compared to the knowledge of others here, so I think at this point I'll defer to others...

DoubleSkulls · Post by **DoubleSkulls** » Tue Jun 10, 2003 4:54 pm

Lots of values

Distance from scoring - this also depends on the number of turns remaining and a scoring strategy.
How well defended the ball carrier is. This can be assessed by the odds of your opponent getting a block (1/2, 1 or 2) on the carrier - and how likely they are to survive - a block dodge sure hands player isn't too worried by a 1/2 dice block. Can they be reached at all? Can the opponent put a TZ on them?
How well defended a ball on the ground is.
Number of players who can get blocked in opponents turn (1/2 dice blocks are potentially good, 2 dice blocks aren't).
Players vulnerable to being blocked or blitzed out of bounds
Number of my players who can score next turn (weighted by risk).
Number of opposing players who can score next turn (weighted by risk).
etc

There's more, but I think like Richard said, you need to sit down and work it all out with a bit of pen & paper - and get your opponent to do the same.
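A list of values like the one above is usually combined into a single number with a weighted sum, so that two positions can be compared. This is a sketch only: the feature names are shorthand for some of the items in the list, and every weight is an invented placeholder that would need tuning.

```python
# Hypothetical weights; negative means "bad for me".
WEIGHTS = {
    "squares_from_scoring": -2,
    "carrier_knockdown_chance": -30,
    "my_scorers_next_turn": 10,
    "opp_scorers_next_turn": -12,
    "players_exposed_to_2d_blocks": -3,
}


def evaluate(features, weights=WEIGHTS):
    """Score a position as a weighted sum of its feature values."""
    return sum(weights[name] * value for name, value in features.items())


# Two made-up positions: A is closer to scoring, B protects the carrier better.
position_a = {
    "squares_from_scoring": 6,
    "carrier_knockdown_chance": 0.5,
    "my_scorers_next_turn": 1,
    "opp_scorers_next_turn": 0,
    "players_exposed_to_2d_blocks": 2,
}
position_b = {
    "squares_from_scoring": 8,
    "carrier_knockdown_chance": 0.0,
    "my_scorers_next_turn": 1,
    "opp_scorers_next_turn": 0,
    "players_exposed_to_2d_blocks": 1,
}

print(evaluate(position_a), evaluate(position_b))
```

With these placeholder weights the safer position B scores higher than A; change the weights and the preference can flip, which is exactly the tuning problem.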

Actually thinking about it, although positional weightings may be useful once a strategy for the turn has been decided I don't think it will help over a longer time span.

You normally have a strategy which covers the drive - quick score (asap taking moderate risk)/ safe score (3/4 turns low risk) /slow score (7-8 turns very low risk).

Defence also has several styles - stand off, cover receivers, chase ball carrier and associated risk levels e.g. I'll try for a quick turnover and score. So I kick deep, flood the backfield and try to make my opponent turn the ball over. If it doesn't work, well let's try it again next time.

redlizard · Post by **redlizard** » Tue Jun 10, 2003 5:08 pm

If anyone's seriously thinking about tackling this monster, I would suggest starting by seeing if you can successfully program a simpler subset. See if you can program a one-off match of orcs vs. humans with premade rookie teams. If this is successfully pulled off then it can later be expanded with other teams, skill selection, etc.

Best to solve a small problem first rather than a huge mess.

grep-v · Post by **grep-v** » Tue Jun 10, 2003 9:59 pm

Jerhod wrote: The best way I can think of to do this would be to do a search along the lines of:

1) Is a TD possible?
2) If so, in how many ways is a TD possible?
3) Of these ways, what is the method that is statistically most likely to result in a successful TD attempt?
4) If you were not to make a TD attempt, could you position yourself to make a TD attempt next turn more likely than it is this turn?
(......)
I think that a computer would be able to get up to #3, but I doubt that a computer could get through #4.

Funny, I wanted to interrupt that enumeration at point 4.

You are perfectly right, not because an algorithm couldn't cope with it, but because this is the point where we need to break things down into smaller things. We can't calculate every possibility, so it's necessary to retreat to solving smaller subproblems. The result will be only an approximation to the optimal solution, but the task is now to make this approximation better and better. You still won't find the optimal BB-tactics, but perhaps a nice opponent to train with, when your buddies are too bored.

grep-v · Post by **grep-v** » Tue Jun 10, 2003 10:02 pm

tonythep wrote: If I had to guess, it'd look something like this:

For the team with the ball:

RUN_PLAY: (........)
PASS_PLAY: (........)

(.....)
All of this is hugely simplified, of course, and is likely to have at least one outright error in it because I'm pre-coffee. But you asked.

Yep! Now take a look at my second or third posting. The things you simplified are the strategical patterns, the calculations are the tactical patterns (I just broke it up into smaller parts).

grep-v · Post by **grep-v** » Tue Jun 10, 2003 10:21 pm

ianwilliams wrote:Lots of values

Distance from scoring - this also depends on the number of turns remaining and a scoring strategy.

How well defended the ball carrier is. This can be assessed by the odds of your opponent getting a block (1/2, 1 or 2) on the carrier - and how likely they are to survive - a block dodge sure hands player isn't too worried by a 1/2 dice block. Can they be reached at all? Can the opponent put a TZ on them?

How well defended a ball on the ground is.

Number of players who can get blocked in opponents turn (1/2 dice blocks are potentially good, 2 dice blocks aren't).

Players vulnerable to being blocked or blitzed out of bounds

Number of my players who can score next turn (weighted by risk).

Number of opposing players who can score next turn (weighted by risk).

etc
There's more, but I think like Richard said, you need to sit down and work it all out with a bit of pen & paper - and get your opponent to do the same.

That's the right direction.
Point 1 isn't part of a static evaluation, as many subsequent turns would have to be evaluated, which is likely not possible, so it's out.
Points 2,3 correspond to the "square importance level".
Points 4,5,6 correspond to "impact level" (in a lesser view also "opp. threat level")
Point 7 corresponds to "opponent's threat level"
So we would have to find a scoring scheme for all these points and probably the points 8+.

ianwilliams wrote: Actually thinking about it, although positional weightings may be useful once a strategy for the turn has been decided I don't think it will help over a longer time span.

So we have to recalculate it whenever something has changed, i.e. an action has been carried out or has come to a critical point (reroll on a push-block? etc.)

ianwilliams wrote: You normally have a strategy which covers the drive - quick score (asap taking moderate risk)/ safe score (3/4 turns low risk) /slow score (7-8 turns very low risk).
Defence also has several styles - stand off, cover receivers, chase ball carrier and associated risk levels e.g. I'll try for a quick turnover and score. So I kick deep, flood the backfield and try to make my opponent turn the ball over. If it doesn't work, well let's try it again next time.

Sounds good again. And we must have the capability to switch between short term strategies in case something goes wrong (block is a standoff, catcher can't break through, so don't pass but take the thrower back a few squares) or a potential scoring situation arises (opp. thrower fumbles ball, own players are within range).
So the short term strategy has to be re-evaluated on a regular basis too.
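As a rough sketch of that re-evaluation loop: pick a strategy, carry out one action, then pick again from the new state. The strategy names come from the posts above, but the turn thresholds, the `state` dictionary and the action functions are all invented for illustration.

```python
def choose_strategy(turns_left, have_ball):
    """Pick a short term strategy from a (deliberately crude) view of the state."""
    if not have_ball:
        return "chase ball carrier"
    if turns_left <= 2:
        return "quick score"
    if turns_left <= 4:
        return "safe score"
    return "slow score"


def play_turn(state, actions):
    """Apply actions one at a time, re-picking the strategy after each one."""
    log = []
    for action in actions:
        state = action(state)
        log.append(choose_strategy(state["turns_left"], state["have_ball"]))
    return state, log


def move_safely(state):
    return dict(state)  # nothing relevant changes


def fumble_pickup(state):
    return {**state, "have_ball": False}  # something goes wrong


start = {"turns_left": 6, "have_ball": True}
final_state, log = play_turn(start, [move_safely, fumble_pickup])
print(log)
```

A real version would also need the critical points mentioned above (a reroll decision, for example) as places where the strategy is reconsidered.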