Fingerprinting: Visualization and Automatic Analysis of Prisoner's Dilemma Strategies

D. Ashlock, E.-Y. Kim
2008 IEEE Transactions on Evolutionary Computation  
Fingerprinting is a technique for generating a representation-independent functional signature for a game playing agent. Fingerprints can be used to compare agents across representations in an automatic fashion. The theory of fingerprints is developed for software agents that play the iterated prisoner's dilemma. Examples of the technique for computing fingerprints are given. The paper summarizes past results and introduces the following new results. Fingerprints of prisoner's dilemma ... that are represented as finite state machines must be rational functions. An example of a strategy that does not have a finite state representation and which does not have a rational fingerprint function is given: the majority strategy. It is shown that the AllD- and AllC-based fingerprints can be derived from the tit-for-tat fingerprint by a simple substitution. Fingerprints for four new probe strategies are introduced, generalizing previous work in which tit-for-tat is the sole probe strategy. A trial comparison is made of evolved prisoner's dilemma strategies across three representations: finite state machines, feed forward neural nets, and lookup tables. Fingerprinting demonstrates that all three representations sample the strategy space in a radically different manner, even though the neural net's and lookup table's parameters are alternate encodings of the same strategy space. This space of strategies is also a subset of those encoded by the finite state representation. Shortcomings of the fingerprint technique are outlined, with illustrative examples, and possible paths to overcome these shortcomings are given.

TABLE I. EXAMPLES OF PRISONER'S DILEMMA STRATEGIES.
Always Cooperate (AllC): This strategy always plays C.
Always Defect (AllD): This strategy always plays D.
Fortress-3 (Fort3): This strategy is an example of a strategy that uses a password. If the opponent defects twice in a row (the password) and cooperates thereafter, then Fortress-3 will cooperate. Any deviation from this sequence resets the need to defect twice. A minimal finite state implementation of Fortress-3 is shown in Figure 3. Fortress-3 was first defined in [12] and is an example of a strategy that only arises after substantial evolution has taken place.
Majority (Maj): This strategy returns a play equal to the majority of its opponent's plays, breaking ties in favor of cooperation. Majority has no finite state representation.
Pavlov (Pav): The strategy, Pavlov, plays C as its initial action and cooperates thereafter if its action and its opponent's action matched last time. A minimal finite state implementation of Pavlov is shown in Figure 3.
Periodic CD (PerCD): This strategy cooperates and defects on alternate moves no matter what its opponent does.
Psycho (Psy): The strategy, Psycho, chooses D as its initial action and then plays the opposite of its opponent's last action.
Random (Rand): The Random strategy simply flips a fair coin to decide how to play. Random has no finite state representation.
Ripoff (Rip): This strategy alternates cooperation and defection until its opponent defects for the first time. On the round after this defection, it cooperates and then plays tit-for-tat thereafter.
Thumper (Thmpr): This strategy cooperates initially. If its opponent defects, then it defects on the next two moves; if its opponent's second move after defection is cooperate, it continues cooperating; otherwise it defects twice as before. A minimal finite state implementation of Thumper is shown in Figure 3.
Tit-for-tat (TFT): The strategy, tit-for-tat, plays C as its initial action and then repeats the other player's last action.
Tit-for-two-tats (TF2T): This strategy defects only if its opponent has defected on the last two moves.
Tit-for-three-tats (TF3T): This strategy defects only if its opponent has defected on the last three moves.
Two-tits-for-tat (TTFT): This strategy defects on the two moves after its opponent defects, otherwise it cooperates.
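
Several of the Table I strategies can be written directly as functions of the two move histories. The following Python sketch is illustrative only and is not taken from the paper: the function names, the `play` helper, and the history-list interface are hypothetical choices made here. It also assumes that Pavlov defects when the previous actions did not match and that Periodic CD opens with C, neither of which the table states explicitly.

```python
from typing import Callable, List, Tuple

C, D = "C", "D"
# A strategy maps (own past moves, opponent's past moves) to its next move.
# This interface is an assumption of the sketch, not the paper's representation.
Strategy = Callable[[List[str], List[str]], str]

def all_c(own: List[str], opp: List[str]) -> str:
    return C  # always cooperate

def all_d(own: List[str], opp: List[str]) -> str:
    return D  # always defect

def tit_for_tat(own: List[str], opp: List[str]) -> str:
    # C initially, then repeat the opponent's last action.
    return C if not opp else opp[-1]

def tit_for_two_tats(own: List[str], opp: List[str]) -> str:
    # Defect only if the opponent defected on the last two moves.
    return D if opp[-2:] == [D, D] else C

def pavlov(own: List[str], opp: List[str]) -> str:
    # C initially; cooperate if both players' last actions matched.
    # Assumption: defect when they did not match.
    if not own:
        return C
    return C if own[-1] == opp[-1] else D

def psycho(own: List[str], opp: List[str]) -> str:
    # D initially, then the opposite of the opponent's last action.
    if not opp:
        return D
    return D if opp[-1] == C else C

def majority(own: List[str], opp: List[str]) -> str:
    # Play the majority of the opponent's plays; ties favor cooperation.
    return D if opp.count(D) > opp.count(C) else C

def periodic_cd(own: List[str], opp: List[str]) -> str:
    # Alternate C and D regardless of the opponent.
    # Assumption: the cycle starts with C.
    return C if len(own) % 2 == 0 else D

def play(a: Strategy, b: Strategy, rounds: int) -> Tuple[str, str]:
    """Play an iterated game and return both move sequences as strings."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)
```

For example, `play(tit_for_tat, all_d, 4)` returns `("CDDD", "DDDD")`. Majority needs the full count of the opponent's plays, which is consistent with the table's remark that it has no finite state representation; the other functions above look at no more than the last two moves.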
doi:10.1109/tevc.2008.920675 fatcat:53grwrsw3rewhoccrmn7b73he4