Danny Loeb
The Diplomacy Programming Project
The Observation Module
In the S1995M issue of Diplomatic Pouch,
we described the functioning
of the strategy finder, that is, the strategic core on which our
Bordeaux Diplomat is based.
This lower brain of the diplomat does all the calculations and thus
forms the basis for the decisions taken by the upper brain, or
negotiator part of the diplomat.
The negotiator does not know the details of the map, nor the rules of
the game. On the other hand, it must be able to communicate with the
other players, distinguish between friends and enemies, form
alliances, and so on. To make decisions, the negotiator poses
concrete questions to the strategic core, and following the answers to
these questions, it chooses the appropriate moves to submit, or
messages to send out.
The strategic core does not know what country the diplomat is playing.
It is simply a disinterested, objective observer who calculates the
best moves for all countries, given a certain number of constraints,
along with the value of the resulting position for each country.
The negotiator determines these diplomatic constraints in two ways:
- by negotiating with the other players, and
- by observing the moves made on the board
In this article, we will discuss the observation module which is
responsible for this second task. This module has been partially
implemented thanks to Arnaud Moulard and Christophe
Moustier.
The program was written in LCS (a version of SML developed in
Toulouse for parallel processing). It takes as input the adjudicated
moves from a single season, and a friendliness matrix. The program
outputs a revised friendliness matrix. This 7 by 7 matrix of real
numbers indicates the love/hate relationship between each pair of
players. Positive values
indicate friendliness and negative values indicate animosity.
Note that "trustworthiness" is tracked with an
independent data structure called StabLimit. StabLimit is the observed
minimum gain required for a country to violate its agreements. For
example, in the game DipPouch, England stabbed France by opening to
the English Channel. If the diplomat judged that such a move gained
little for England, then England would be given a low StabLimit. Thus, even
if England finally did make peace with France, the program would
continue to anticipate that a stab would be made under similar
circumstances.
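As an illustration only, StabLimit could be kept as a per-country running minimum. The sketch below is in Python rather than LCS; the class layout, the method names and the unit in which "gain" is measured are all assumptions, not the module's actual code.

```python
import math


class StabLimit:
    """Smallest gain for which each country was seen to break an agreement.

    Hypothetical layout: the article only says that StabLimit is the
    observed minimum gain required for a country to violate its agreements.
    """

    def __init__(self, countries):
        # No stab observed yet, so the limit starts out infinite.
        self.limit = {country: math.inf for country in countries}

    def record_stab(self, country, gain):
        # A stab made for a small gain lowers that country's limit.
        self.limit[country] = min(self.limit[country], gain)

    def expects_stab(self, country, gain):
        # Anticipate a stab whenever the gain on offer reaches the limit.
        return gain >= self.limit[country]
```

Because the limit is never raised again, a country that once stabbed for little gain stays under suspicion even after making peace, which matches the England and France example.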
The following factors are used to update the Friendliness Matrix.
(Factors not yet implemented are in parentheses.)
- Violation of Specific Agreements. Agreements not to
occupy a certain space DMZ ..., to make certain moves
XDO ..., or to not make certain moves NOT(XDO
...) can be automatically verified. A bonus
is given when the agreement is respected. In case of
violation, a penalty is given
(and the "ambassador" process for the
nation in question is signalled so that it can ask for an
explanation. The gain achieved by stabbing, or forgone by
refusing to stab, is computed and stored as STABLIMIT data,
and friendliness is updated accordingly.)
- (Violation of General Agreements such as "peace"
PCE, or absence of agreement in the case of a
silent partner. Use the strategy finder to check whether the moves
correspond better to those expected when the
countries are allied or to those expected when they are not allied.)
- Coordinated Moves. If a country supports or convoys the
units of another country SUP, CNV, CTO, this indicates
that the two countries are probably coordinating their moves,
and a bonus is awarded.
If a support or convoy was refused NSO, then this may
indicate a possible stab, and a small penalty is given.
- Attacks. Penalties are given for all conflicts,
according to whether support was CUT, units bounced
BNC, or the attack totally failed FLD. An
especially big penalty is given if the defender is forced to
retreat RET.
- Movement. If an attempt is made to move, convoy, retreat, build
or destroy a unit of player X, then compare the unit's current
distance to the nearest supply center of player Y with the distance
from the space it attempted to reach. The distance is calculated
according to whether the unit is an army or a fleet. In the case of
units being built or destroyed, the distance is considered "infinite"
for a nonexistent unit.
This change in distance is filtered (0-1 is a big difference,
4-5 hardly counts at all) and reduced by 50% if the move failed
FLD or bounced BNC.
The filter function, whose first derivative is negative and whose
second derivative is positive, weights the distance between the unit
and the player in question: the larger the distance, the less
important the unit.
We weight successful advances more heavily than unsuccessful
attacks.
- History. To take into account the history of the game,
we add in 70% of the previous friendliness matrix.
- (Size. The fact that builds are always considered
aggressive and removals are always considered friendly will allow
us to concentrate on our most dangerous opponents. However,
additional penalties must be given to any country much closer
to victory than the other country. This could be measured by
number of supply centers, or better, by how many moves would
be needed to capture 18 under ideal circumstances.)
- Symmetry. The friendliness is stored in a matrix F.
F is replaced with F + F-transpose. This represents the fact
that friendliness is a symmetrical relationship: the
relationship between X and Y is the same as that between Y and
X.
- Transitivity. F is then replaced with F plus a small
multiple of F^2. This means that the friends of our
friends and the enemies of our enemies have a tendency to be
our friends, and the friends of our enemies and the enemies of
our friends have a tendency to be our enemies.
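The distance filter used by the Movement factor can be sketched as follows (in Python, not the original LCS). The article only requires a decreasing, convex function; the particular choice 2^(-d), and the names weight and threat_change, are assumptions made for illustration.

```python
import math


def weight(distance):
    """Hypothetical filter: decreasing and convex in the distance."""
    if math.isinf(distance):
        return 0.0  # a nonexistent unit carries no weight
    return 2.0 ** (-distance)


def threat_change(dist_before, dist_after, succeeded=True):
    """Filtered change in distance; positive when the unit comes closer.

    dist_before and dist_after are distances from X's unit to the nearest
    supply center of Y, before the order and at the attempted destination.
    """
    change = weight(dist_after) - weight(dist_before)
    if not succeeded:
        change *= 0.5  # failed or bounced moves count for half
    return change
```

With this choice, closing from distance 1 to 0 changes the weight by 0.5, while closing from 5 to 4 changes it by only about 0.03. A build is modelled as a move from infinite distance, and a removal as a move to infinite distance.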
We thus calculate the friendliness matrix.
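The last three steps (History, Symmetry, Transitivity) are plain matrix operations. Here is a minimal Python sketch using nested lists; the function names, the order in which the observed adjustments are combined with the history term, and the coefficient 0.05 standing in for the "small multiple" are assumptions.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def add(a, b, scale=1.0):
    """Return a + scale * b, element by element."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def update_friendliness(previous, observed, history=0.7, transitivity=0.05):
    """Combine this season's observed adjustments with the old matrix."""
    f = add(observed, previous, history)      # History: add 70% of old matrix
    f = add(f, transpose(f))                  # Symmetry: F + F-transpose
    f = add(f, matmul(f, f), transitivity)    # Transitivity: F + k * F^2
    return f
```

For example, if A helped B and B attacked C this season, the F^2 term gives A and C a slightly negative entry even though they never interacted directly.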
We use so many different criteria so that it is as difficult as
possible to fake true friendship or true war.
Alliance thresholds are determined from this friendliness matrix in
order to divide the seven countries into about 4 "blocs" or
"alliances". These
blocs are used as parameters of the strategy
finder in its calculations.
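The article does not say how the alliance thresholds are chosen. One possible approach, sketched below in Python as an assumption, is to merge the friendliest pairs first until the desired number of blocs remains; the friendliness of the last merged pair then serves as the threshold. The name find_blocs and its parameters are hypothetical.

```python
def find_blocs(friendliness, names, target=4):
    """Group countries into about `target` blocs (assumed method)."""
    n = len(names)
    parent = list(range(n))

    def root(i):
        # Follow parent links up to the representative of i's bloc.
        while parent[i] != i:
            i = parent[i]
        return i

    # All pairs of countries, friendliest first.
    pairs = sorted(
        ((friendliness[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    blocs = n
    threshold = None
    for value, i, j in pairs:
        # Stop once few enough blocs remain; never ally on hostility.
        if blocs <= target or value <= 0:
            break
        ri, rj = root(i), root(j)
        if ri != rj:
            parent[rj] = ri
            blocs -= 1
            threshold = value
    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(root(i), []).append(name)
    return sorted(groups.values()), threshold
```

Since pairs with non-positive friendliness are never merged, the result may contain more than `target` blocs, which is consistent with "about 4".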
The rules allow 15 minutes for negotiation after a movement phase.
Our observation module requires about 15 seconds to calculate the new
matrix, leaving the bulk of the available time to negotiate and
calculate new orders.
Danny Loeb
Universite de Bordeaux I
(loeb@delanet.com)