Accidentally Cheating at Backgammon

Why players perceive unfairness in Backgammon software

On a regular basis, inexperienced Backgammon players voice the opinion that a certain computer program or game server cheats.  The stochastic nature of the game lends itself to this kind of perception on the part of human beings.  Generally, there are a few primary reasons for this belief.

First, novice players often fail to recognize the complexities of the game of Backgammon, so what they perceive as an unnatural number of “lucky rolls” is not (necessarily) due to luck, but rather due to skillful play on the part of the opponent.  Expert players tend toward positions where a greater number of rolls would be considered good (i.e., “lucky”).  A higher percentage of good rolls tends to make the dice appear biased in one’s favor, and engineering that percentage is key to good checker play.

In many cases, players also fail to understand the nature of truly random numbers.  It is often stated that, say, a certain number of doubles in a row indicates…  excuse me…  “proves” that the virtual dice are unfair when, in fact, a truly random number generator must eventually produce any arbitrary sequence (whether or not a pattern is perceptible) given enough rolls.  Of course, we are talking about pseudo-random number generators (PRNGs), so they are, by their very nature, not truly random.  However, one would have to do an actual count and statistical analysis of the dice rolls to draw any conclusion about a particular PRNG.
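A quick simulation makes the point concrete.  The sketch below uses Python’s standard `random` module (not the PRNG of any particular Backgammon program) to show that long runs of doubles are an expected feature of fair dice, not evidence against them:

```python
import random

def longest_double_streak(num_rolls, seed=0):
    """Roll two dice num_rolls times and return the longest run of doubles."""
    rng = random.Random(seed)
    longest = current = 0
    for _ in range(num_rolls):
        if rng.randint(1, 6) == rng.randint(1, 6):
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

# Even perfectly fair dice produce memorable streaks over a long session:
print(longest_double_streak(10_000))
```

Over 10,000 rolls, a streak of several consecutive doubles is almost certain to occur by chance alone, which is exactly the kind of event players remember and report.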

The reason for this need to analyze a PRNG scientifically, rather than anecdotally, seems fairly obvious.  Human beings have selective memory, which means that we tend to recall things that are out of the ordinary, so a number of doubles in a row stands out, whereas a statistically identical sequence of rolls that shows no obvious pattern goes unreported.  Likewise, a few very good (or very bad) rolls are more memorable than many run-of-the-mill rolls.

Related to this is the concept of apophenia, which is the human “experience of seeing patterns or connections in random or meaningless data.” [from Wikipedia]  Our minds have evolved to recognize patterns, so we can sometimes perceive things that are not there.  This is how people see images in clouds, hear music or sounds in white noise, and imagine divine imagery in oil stains or burnt toast.

All of these factors make it very easy for an average person to perceive unfairness in Backgammon software or servers (even in games against other human beings), and even trained experts can be fooled.

How experts demonstrate that Backgammon software is fair

There are a few key points that are usually made by experts when arguing that a particular Backgammon program does not cheat.  First, of course, one generally describes some of the aspects of the perception problem, as listed above.  In particular, reports are almost always anecdotal, so they can be dismissed quickly as having no scientific validity until somebody does an actual count and statistical analysis.

To dismiss accusations of manipulated dice (by software), the usual suggestion is to manually input dice rolls (which most decent programs allow), using meticulously recorded rolls of physical dice, or to switch to an alternative PRNG.  If the results stay statistically consistent, that argues against the idea that the rolls are manipulated.  Another common accusation is that programs can “look ahead” to see which rolls are upcoming and make moves based on this prior knowledge; manual input of dice rolls removes this possibility as well.

Another method to test whether dice rolls are being artificially manipulated is to switch sides and look for discrepancies.  In other words, start a game (or save one in progress) with a particular random number seed and play the rest of the game, recording the dice rolls for each side.  Then, restart (or load) the game and play the opposite side.  If the dice rolls remain the same no matter which side the program plays, then the rolls are not being manipulated to bias the outcome.
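The principle behind this test can be sketched in a few lines of Python, assuming a seedable PRNG (here, `random.Random` stands in for whatever generator a program actually uses): the roll sequence is a pure function of the seed, so it cannot depend on which side the human plays.

```python
import random

def roll_sequence(seed, num_rolls):
    """The dice a seeded PRNG produces, independent of who plays which side."""
    rng = random.Random(seed)
    return [(rng.randint(1, 6), rng.randint(1, 6)) for _ in range(num_rolls)]

# "Play" the game twice from the same seed, switching sides the second time:
as_player_one = roll_sequence(seed=42, num_rolls=20)
as_player_two = roll_sequence(seed=42, num_rolls=20)
assert as_player_one == as_player_two  # same rolls => no per-side manipulation
```

If the sequences differed between the two runs, the program would have to be choosing rolls based on something other than the seed, which is exactly what the test is designed to detect.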

A final, less scientific, approach is the simple “Why?” method, wherein one looks at the reasons why (and how) a programmer might decide to write a biased program.  Speaking as the primary programmer for MVP Backgammon Professional, from MVP Software, I can assure you that cheating would add a whole extra layer of (unwanted and unnecessary) complexity, so I certainly did not and would not include such code.  In fact, accusations of unfairness against the first version of the program were troubling enough to MVP that our version has a replaceable PRNG library, so one can write one’s own (with whatever extra checking is desired).
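To illustrate the idea of a replaceable dice source, here is a hypothetical sketch in Python.  The class and method names are illustrative only, not MVP Backgammon’s actual API; the point is that the game engine only ever asks an abstract source for die faces, so a PRNG, a user-written generator, or manually recorded physical rolls can all be swapped in:

```python
import random

class StandardDice:
    """Default dice source: a seedable PRNG (random.Random stands in here)."""
    def __init__(self, seed=None):
        self._rng = random.Random(seed)
    def roll_die(self):
        return self._rng.randint(1, 6)

class ManualDice:
    """Alternative source: meticulously recorded rolls of physical dice."""
    def __init__(self, recorded_rolls):
        self._rolls = iter(recorded_rolls)
    def roll_die(self):
        return next(self._rolls)

def roll(dice):
    """The game engine only ever asks its dice source for faces."""
    return dice.roll_die(), dice.roll_die()

game_dice = ManualDice([3, 5, 6, 6])
print(roll(game_dice))  # -> (3, 5): the recorded rolls drive the game
```

Because the engine never touches the generator directly, any dice-manipulation code would have to live in the replaceable module, where a suspicious user could inspect or rewrite it.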

Possibility for Backgammon software to cheat without malice aforethought

This whole topic was reinvigorated when yet another thread appeared recently, entitled “Jellyfish.  Cheating or just Lucky” [links to Google groups].  Through dozens of messages, some people suggested/argued that the Backgammon program Jellyfish seemed to cheat, while two other popular programs, GNU Backgammon and Snowie, did not.

Interestingly (and, n.b., anecdotally), when testing MVP Backgammon, I had a similar experience.  I was simply testing relative strength with a series of 25-point matches between my program and these others.  Although the strength of my neural network was comparable to the others, it got beaten significantly by Jellyfish whenever Jellyfish rolled the dice.  When MVPBG rolled, the results were much closer.  As a final test, I played one match with manual rolls, and it was again close.  At this point, I figured out the likely problem (leaving alive the possibility that it was just sheer chance).

The whole purpose of a neural network is to discover connections and patterns in provided data, and the conclusions are affected by the design of the inputs (essentially, which raw data is supplied) and, of course, the requested output(s).  In our design, we basically supplied the number of checkers on each point (in a special format), the number on the bar, and the number borne off.  This specifies a pure position in the game (with no knowledge about moves or rolls), and our outputs were designed to estimate the probability of each potential game outcome (win, loss, or winning/losing either a gammon or backgammon).  The neural network was only used for evaluation; the selection of moves was based on the evaluation of the resulting position (and cube decisions were calculated mathematically from the neural network outputs).
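To make the “pure position” input concrete, here is a simplified Python sketch of such an encoding.  The exact format MVP Backgammon used is not reproduced here; this is only an illustration of the principle that the network sees checker counts and nothing else — no dice, no move history:

```python
def encode_position(points, bar, off):
    """Flatten a pure backgammon position into a neural-net input vector.
    points: 24 signed counts (+ for player, - for opponent);
    bar, off: (player, opponent) checker counts."""
    features = []
    for n in points:
        # one slot per side per point (a simplified stand-in for the
        # "special format" mentioned above)
        features.append(max(n, 0))
        features.append(max(-n, 0))
    features.extend(bar)
    features.extend(off)
    return features  # note: no dice, no move history -- position only

# The standard backgammon starting position, from the player's viewpoint:
start = [-2, 0, 0, 0, 0, 5, 0, 3, 0, 0, 0, -5,
         5, 0, 0, 0, -3, 0, -5, 0, 0, 0, 0, 2]
vec = encode_position(start, bar=(0, 0), off=(0, 0))
print(len(vec))  # 24 points * 2 sides + 2 bar + 2 off = 52 features
```

Because the input vector is a function of the position alone, two games that reach the same position by different rolls produce identical inputs, so the network has no channel through which roll information could leak in.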

Theoretically, we could provide irrelevant inputs (e.g., outside temperature) and during training, their influence on the network would tend toward zero.  However, providing somewhat related data, such as the last game move, could give the neural network just enough information to begin to anticipate an outcome and bias the outputs.  More directly, providing the current dice roll, or perhaps designing the neural network to rate individual moves based on that roll, gives the network additional information that could be used to actually predict the next pseudo-random roll, especially if the particular PRNG is not very good.  After all, guessing what the next roll would be based on the position and previous roll is exactly the kind of task that neural networks are designed to solve.
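A toy example shows how easy this kind of prediction can be when the generator is weak.  The sketch below is not Jellyfish’s actual PRNG (which is unknown to me), and a simple lookup table stands in for what a neural network could learn; the point is that with a small-state generator, the previous roll alone can determine the next one:

```python
def weak_rng(seed):
    """A deliberately terrible PRNG: only 36 possible states, short period."""
    x = seed % 36
    while True:
        x = (5 * x + 3) % 36
        yield (x % 6) + 1, (x // 6) + 1  # state <-> dice pair is one-to-one

rng = weak_rng(seed=7)
history = [next(rng) for _ in range(100)]

# A lookup table stands in for what a neural network could learn:
# which roll follows which.
following = {}
for prev, nxt in zip(history, history[1:]):
    following[prev] = nxt

# Predict rolls the "learner" has never seen:
prev = history[-1]
correct = 0
for _ in range(50):
    guess = following[prev]
    prev = next(rng)
    correct += (guess == prev)
print(correct, "of 50 predictions correct")
```

With this generator, every prediction is correct, because the tiny state space means each roll uniquely determines its successor.  A real PRNG is far harder to pin down, but a network given roll data could still pick up partial correlations and bias its evaluations accordingly.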

Based on this observation, I suggest that it is possible that the programmers of Jellyfish may have inadvertently, and with no malicious intent whatsoever, provided their neural network with just a little too much information, and it may have taken that information to (at least partially) figure out the random number sequence and then draw conclusions that were not intended.

This would be a very interesting (and perhaps slightly startling) example of emergent behavior in a computer system.  It would, however, explain why a program could pass all of the tests to “prove” it is not cheating, but still have an observable bias when using its own dice.  I suppose we could call it “computer intuition”.  Of course, without more scientific study, it could still just be called “luck”.