The Diplomacy |
After the Fall moves have been played, and the retreats (if any) made, each player's number of units must be adjusted to equal the number of supply centers his Great Power controls.Last issue we discussed the complexity of the search problems posed to the computer diplomat by the retreat phase in the game of diplomacy. In comparison, how difficult a problem is posed by winter adjustments (builds/removals)? To measure this problem, we analyze the worst-case scenario that a potential computer diplomat might face.
An initial version of the adjustment portion of the strategy finder of the Bordeaux diplomat was written by Brian Chojnowski (University of Illinois, Champaign-Urbana). Some alternate algorithms were proposed by Jean-Michel Faure (Bordeaux) and Per Westling (Linkoping, Sweden). The current version programmed by Gilles Schaeffer (Bordeaux, ENS/Paris) is presented below. We conclude with possible future developments.
As we have seen there are 4,430,690,040,914,420 significantly different Spring 1901 openings in Diplomacy. Searching completely such a large space is an impossible task that grows worse in successive movement phases as a typical game of Diplomacy progresses.
On the other hand, a typical retreat phase involves only a small number of units with limited interactions, and tightly constrained movements. We can thus make a complete search of all possible retreats combinations. (Actually, we don't really consider all combinations, since "independent" retreats are optimized separately in order to gain time for a 2 or 3-ply look-ahead in our search.)
If he has fewer centers than units, he must disband the excess units only, by removing them from the board. The units removed may be any of his units he chooses.A player can have any number n of units up to 17. He may then be required to remove k of them where k is some number between 0 and n. He can remove any k he wants. He can remove no more and no less.
Thus, the number of choices is exactly the binomial coefficient n!/k!(n-k)! as given in the table below.
n\k 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 1 1 2 2 1 3 3 3 1 4 4 6 4 1 5 5 10 10 5 1 6 6 15 20 15 6 1 7 7 21 35 35 21 7 1 8 8 28 56 70 56 28 8 1 9 9 36 84 126 126 84 36 9 1 10 10 45 120 210 252 210 120 45 10 1 11 11 55 165 330 462 462 330 165 55 11 1 12 12 66 220 495 792 924 792 495 220 66 12 1 13 13 78 286 715 1287 1716 1716 1287 715 286 78 13 1 14 14 91 364 1001 2002 3003 3432 3003 2002 1001 364 91 14 1 15 15 105 455 1365 3003 5005 6435 6435 5005 3003 1365 455 105 15 1 16 16 120 560 1820 4368 8008 11440 12870 11440 8008 4368 1820 560 120 16 1 17 17 136 680 2380 6188 12376 19448 24310 24310 19448 12376 6188 2380 680 136 17 1
By inspection we see that the largest number of choices, namely 24,310, arises when a 17-unit superpower is forced to remove roughly half of his forces.
If he has more centers than units, he may build units by placing them, one in each unoccupied supply center, in his home country only (provided that such supply centers are still under his control). He must specify a fleet or an army in a coastal supply center (if Russia builds a fleet in St. Petersburg, he must specify the coast on which it is to appear, or the build is invalid).The number of builds possible will vary depending upon:
Occasionally, it may be in a countries interest to waive a build that it might otherwise be able to use to create a new unit. For example, Austria might save a build in order to build one fleet now and one fleet next year. Since our program can only look-ahead one or two phases at a time, it is unable to anticipate any advantages that waiving a build might bring next year. Thus, we do not even consider doing so unless so required by diplomatic constraints (eg. demilitarized zones) imposed by the negotiation module. Not counting waives then, Russia still has the most choices (17) when it has 2 or 3 builds.
Other than the share combinatorial challenge of tackling such a problem, there are practical implications of answer. It may be acceptable to write a diplomat that occasionally submits poor moves. However, it is unacceptable to write a diplomat that hangs for decades in certain rare situations trying to compare all the possible build combinations.
If the total search space can bounded by a reasonable number, then we can apply symplex methods to the resulting game matrix and find all of the Nash equilibria. However, if the total search space is too large, then (at least in some circumstances) only approximative methods may be practical.
So just how bad can things get?
The "obvious" solution of having two 17-unit powers each lose 8 supply centers does not work, since we must also assign 16 units to capture their centers. Also, note that a power must have 3 (or 4) units to control all of its supply center (as we assume above).
One possible solution with over 11 billion combinations is the following
From | To | Possibilities | |
---|---|---|---|
Austria | 17 | 8 | 24310 |
Germany | 1 | 2 | 2 |
France | 3 | 5 | 8 |
Turkey | 3 | 5 | 12 |
England | 3 | 5 | 12 |
Italy | 3 | 5 | 12 |
Russia | 4 | 6 | 17 |
Total | 11,426,088,960 |
Our current strategy finder calculates winter orders in two steps.
This assumption allows us to calculate the builds and removals for each country independently. For each country, we consider all the possible combinations or builds and removals, and chose the option that maximize the value of the resulting position. This value may be calculated directly by the position evaluation subprogram. Or if sufficient time is available, the values may be calculated via a recursive call to the strategy finder itself so that the diplomat can anticipate the Spring moves while planning its Winter adjustments.
However, the adjustments of one country can affect those of another.
That is, each country finds the best adjustment for its alliance under the (false) assumption that all other countries will skip their adjustment phase.
Given additional time, our strategy finder will attempt further improvements using simulated annealing. Each country is considered successively an random modifications of its adjustments are proposed until time runs out. The modifications are adopted if the resulting position is better for the country in question given the adjustments currently being considered by the six other powers. As the search progresses, we assume that the moves being considered are practically optimal, and we therefore modify smaller and smaller subsets of adjustments orders at a time.
These two phases of search could be combined as follows.
Once a country exhausts its list of possible moves, then it stops searching until another country changes its mind about its best move. There is no point considering exactly the same combination of moves two times. When it resumes searching it will consider moves in the order of their value as determined in the previous search under the theory that moves that were good under the prior circumstances are likely to still be good.
If all countries exhaust their lists, then each countries adjustment is optimal with respect to the adjustments being considered by the six other countries. We have thus found the "Nash equilibrium" we are looking for.
2. Mixed Equilibria
Occasionally, Nash equilibria involving pure strategies might fail to exist. It might be possible to detect an "infinite loop" in the above search using a hash table, and then use linear programming to find a mixed Nash equilibrium.
3. Disband and Build
The builds should be redesigned to interface with the retreats in a multipli search so that disbanding a unit and rebuilding it elsewhere is considered. This should be done whenever there are units to retreat, and no removals to consider, and at least one home supply center open. In that case, the best retreats will be recalculated in each "component" for various subsets of our units. The best builds will then be factored in, and the results compared.
In anticipating the number of builds, we will assume that any unit which can do so will capture an enemy or neutral supply center with its retreat.
More work could be done on integrating the search for retreats and builds, and especially on testing the work done so far.
For more information, feel free to contact the author at the address below, read previous articles in this series, or consult the DPP Home Page.
Note that plenty of work is left to be done, especially in testing the strategy finder, writing the negotiation module, and on the small games projects.
Danny Loeb
(loeb@delanet.com)
If you wish to e-mail feedback on this article to the author, and clicking on the mail address above does not work for you, feel free to use the "Dear DP..." mail interface.