Anyone who studies macroeconomics will encounter game theory, and anyone who studies game theory will encounter the "prisoner's dilemma" model. I have always believed that, of all market game models, the "prisoner's dilemma" is the closest to psychology, the most revealing of good and evil, and the best embodiment of collective wisdom. The choice between betrayal and cooperation is its constant theme. ...
Look at the current real estate market: a large-scale "prisoner's dilemma" is playing out before our eyes. Vanke took the lead in abandoning the Pareto-optimal solution. Evidently, in the current market environment, the Nash equilibrium it expects is mutual betrayal among real estate developers in the coming predicament. In other words, Vanke chose to strike first in this "prisoner's dilemma". It sold out every other developer because it expected to be sold out itself if it did not betray its partners as soon as possible. After all, this is a non-zero-sum game, and betrayal can bring very large gains. ...
Why do I deliberately use the word "betrayal" here? Look back at the "offensive and defensive alliance" that real estate developers built over the past N years. I was opposed by the First Financial Channel over two questions: "whether there are huge profits in the real estate industry" and "whether real estate development costs should be made public". I have always held that real estate development enterprises make huge profits (whether that is reasonable in the current market environment can be seen in the author's description of the real estate market in 2008). At that time, almost all developers unanimously denied the existence of profiteering. Mr. Pan even coyly compared costs to his wife's breasts, to make the point that costs naturally cannot be shown. Of course, Vanke's voice was among them ... Today Vanke has trampled on the "alliance" it once joined and has stepped out to put betrayal into practice.
Can Vanke really secure the best outcome for itself by selling out its companions? In theory it can, but in practice it may turn out to have miscalculated. The "prisoner's dilemma" produces its result only in a closed environment with extremely asymmetric information: each prisoner's fears and "rational decisions" are isolated from the others', and the game is not repeated (as discussed by Douglas Hofstadter). The current market situation, however, has already occurred more than once (Hainan, nationwide in 2005, Shenzhen in 2008). Such a repeated game will eventually free all participants from the predicament, and the "prisoner's dilemma" will be completely broken. ...
In addition, and most crucially, the government is also a participant in this game. It participates directly at the source, namely the supply (lease) of land, and this will accelerate the breaking of the "prisoner's dilemma", because the government is fundamentally different from every other participant. It is the designer of the whole game, the author of the prison rules, and the one who builds and dismantles the cells. It is hard to imagine that a "prisoner's dilemma" in which the government takes part can be called a real "dilemma". Looking back at Vanke, the government is among the so-called companions it betrayed ... I may not be entirely accurate in saying this. After all, this is not an ideal "prisoner's dilemma": the prisoners are not isolated from one another, and information is not extremely asymmetric, at least within the industry. Has Vanke taken on some kind of mission?
Finally, the demand side also faces a "prisoner's dilemma", as can be seen in Zou Tao's shift from "not buying a house" to "house buying by all the people".
I think there are only two ways for real estate developers, and indeed the whole real estate market, to finally escape the "prisoner's dilemma". One depends on how the most exceptional prisoner mentioned above acts: whether it imprisons itself or pardons everyone. The other is that all prisoners wisely choose to stop the game and build cooperation in the course of the repeated game. They have done this before in past years, and now may be the time to do it again. All they need is collective wisdom. As for the cake stolen by the betrayer, it may only prolong the betrayer's life temporarily; in the end, the betrayer may not even qualify as a prisoner in the future ... all because of one word: "trust"!
After that, perhaps the "prisoner's dilemma" will evolve into a new model, which I call "prisoner's revenge under repeated game results" ...
Two countries each have two choices on tariffs:
Raise tariffs to protect domestic goods. (betrayal)
Reach a tariff agreement with each other and reduce tariffs to ease the circulation of each other's goods. (cooperation)
When one country breaks the tariff agreement for some reason and raises tariffs unilaterally (betrayal), the other country responds in kind (betrayal). The result is a tariff war: each country's goods lose the other's market, and both economies suffer (the outcome of mutual betrayal). The two countries then reach a new tariff agreement. (The outcome of the repeated game is the discovery that mutual cooperation is the most profitable.)
Various examples of the prisoner's dilemma also appear in business. Take advertising competition as an example.
Two companies compete with each other, and their advertising affects each other: if one company's advertisements are better received by customers, it takes away part of the other company's revenue. If both publish advertisements of similar quality at the same time, however, revenue rises little while costs go up. Yet a company that does not improve its advertising will lose business to the other.
The two companies can have two choices:
Reach an agreement with each other to reduce advertising costs. (cooperation)
Increase the cost of advertising, try to improve the quality of advertising and overwhelm the other party. (betrayal)
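To make the two options concrete, here is a minimal sketch that checks whether one of them is a dominant strategy. The profit figures and the action names `restrain` and `escalate` are invented purely for illustration; they do not come from any real company data.

```python
# Hypothetical profits (arbitrary units) as (company A, company B).
# "restrain" = keep the agreement to limit advertising (cooperation)
# "escalate" = raise advertising spending (betrayal)
PROFIT = {
    ("restrain", "restrain"): (10, 10),
    ("restrain", "escalate"): (4, 12),
    ("escalate", "restrain"): (12, 4),
    ("escalate", "escalate"): (6, 6),
}
ACTIONS = ("restrain", "escalate")


def dominant_action_for_a(profit):
    """Return A's strictly dominant action, or None if there is none."""
    for action in ACTIONS:
        alternatives = [a for a in ACTIONS if a != action]
        # Dominant means: strictly better than every alternative,
        # whatever company B chooses.
        if all(
            profit[(action, b)][0] > profit[(alt, b)][0]
            for alt in alternatives
            for b in ACTIONS
        ):
            return action
    return None


print(dominant_action_for_a(PROFIT))
```

With these assumed numbers, escalating is better for A whatever B does, and by symmetry the same holds for B, yet both end up with less than they would earn under the agreement.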
If the two companies do not trust each other and cannot cooperate, betrayal becomes the dominant strategy. The two companies fall into an advertising war, and the rising advertising expenses erode both companies' profits. This is the prisoner's dilemma. In reality, two competing companies find it difficult to reach a cooperation agreement, and most of them fall into the prisoner's dilemma.
In his book The Evolution of Cooperation, Robert Axelrod explored an extension of the classic prisoner's dilemma, which he called the "iterated prisoner's dilemma" (IPD). In this game, participants choose their strategies against each other repeatedly and remember their previous encounters. Axelrod invited academic colleagues from around the world to design computer strategies to compete in an iterated prisoner's dilemma tournament. The programs entered varied widely in algorithmic complexity, initial hostility, capacity for forgiveness, and so on.
Axelrod found that when these encounters were repeated over a long period among many players, each with a different strategy, "greedy" strategies tended to do poorly in the long run from the standpoint of self-interest, while more "altruistic" strategies did better. He used this to show a possible mechanism by which altruistic behavior could evolve, through natural selection, from mechanisms that are initially purely selfish.
The best deterministic strategy turned out to be "tit for tat", which Anatol Rapoport developed and entered into the tournament. It was the simplest of all the programs entered, containing only four lines of BASIC, and it won the contest. The strategy is simply to cooperate in the first round of the iterated game and thereafter to copy the opponent's move from the previous round. A slightly better strategy is "tit for tat with forgiveness": when the opponent defects, the player nevertheless cooperates in the next round with a small probability (about 1% to 5%). This allows for occasional recovery from being trapped in a cycle of mutual defection. "Tit for tat with forgiveness" performs best when miscommunication is introduced into the game, meaning that a move is sometimes reported incorrectly to the opponent: you cooperated, but your opponent was told that you defected.
By analyzing the top-scoring strategies, Axelrod identified several conditions necessary for a strategy to succeed.
Nice
The most important condition is that the strategy must be "nice", that is, it does not defect before its opponent does. Almost all of the top-scoring strategies were nice. Therefore, even a purely selfish strategy will, for purely selfish reasons, never strike its opponent first.
Retaliating
However, Axelrod argued, a successful strategy must not be a blind optimist. It must always retaliate. An example of a non-retaliating strategy is "always cooperate". This is a very bad choice, because "nasty" strategies will ruthlessly exploit such a pushover.
Forgiving
Another quality of successful strategies is that they must be forgiving. Although they retaliate, they fall back to cooperating again if the opponent stops defecting. This ends long runs of revenge and counter-revenge and maximizes the score.
Non-envious
The last quality is being non-envious, that is, not striving to score more than the opponent (a "nice" strategy is necessarily non-envious, since a "nice" strategy can never score more than its opponent).
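The copy-the-last-move strategy (tit for tat) and its forgiving variant described earlier can be sketched in a few lines. This is an illustrative sketch, not Rapoport's original program: the payoff values 5, 3, 1 and 0, the function names, and the `forgive_prob` parameter are all assumptions made for the example.

```python
import random

C, D = "C", "D"
# Assumed payoffs (mine, theirs) for each pair of moves.
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def tit_for_tat(my_history, their_history):
    """Nice: cooperate first. Retaliating and forgiving: copy the last move."""
    return C if not their_history else their_history[-1]


def always_defect(my_history, their_history):
    return D


def make_forgiving_tit_for_tat(forgive_prob, rng):
    """Like tit for tat, but after a defection cooperate anyway
    with probability forgive_prob (the text suggests 1% to 5%)."""
    def strategy(my_history, their_history):
        if their_history and their_history[-1] == D:
            return C if rng.random() < forgive_prob else D
        return C
    return strategy


def play(strategy_a, strategy_b, rounds):
    """Play an iterated match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


forgiving = make_forgiving_tit_for_tat(0.05, random.Random(0))
print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_defect, 10))
print(play(forgiving, always_defect, 10))
```

Against a constant defector, tit for tat loses only the first round and then matches its opponent, which is why it never outscores the opponent in a single match yet can accumulate a high total across many matches.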
Therefore, Axelrod reached a utopian-sounding conclusion: selfish individuals, for their own selfish good, will tend to be nice, forgiving, and non-envious. One of the most important conclusions of Axelrod's study of the iterated prisoner's dilemma is that nice guys can finish first.
Reconsider the arms race model given in the classic prisoner's dilemma section. The conclusion there was that the only rational strategy is to build up military power, even though both countries would rather spend their GDP on butter than on guns. Interestingly, attempts to show that rival countries actually compete in this way (by comparing "high" and "low" military spending across periods under iterated prisoner's dilemma assumptions) often find that the hypothesized arms race does not occur as expected. (For example, Greek and Turkish military spending does not appear to follow a tit-for-tat iterated prisoner's dilemma arms race, but is more likely driven by domestic politics.) This may be an example of rational behavior differing between the one-shot and iterated forms of the game.
For the one-shot prisoner's dilemma, the optimal (points-maximizing) strategy is simply to defect; as explained earlier, this holds whatever the opponent does. In the iterated prisoner's dilemma, however, the optimal strategy depends on the strategies of likely opponents and on how they react to defection and cooperation. For example, consider a population in which everyone defects every time, except for a single individual who follows tit for tat. That individual is at a slight disadvantage because of the loss in the first round. In such a population, the individual's optimal strategy is to defect every time. In a population where a certain proportion always defect and the rest play tit for tat, an individual's optimal strategy depends on that proportion and on the length of the game.
Bayesian Nash equilibrium: if the statistical distribution of opposing strategies can be determined (for example, 50% tit for tat and 50% always cooperate), then an optimal counter-strategy can be derived mathematically [4].
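The dependence on the opponent mix and the game length can be worked through numerically. The sketch below assumes payoffs T=5, R=3, P=1, S=0 and considers only three candidate strategies; both choices are illustrative assumptions, and the closed-form match totals hold only for these simple strategies with at least one round.

```python
# Assumed payoffs: temptation, reward, punishment, sucker's payoff.
T, R, P, S = 5, 3, 1, 0
CANDIDATES = ("always_defect", "tit_for_tat", "always_cooperate")


def match_score(me, opponent, n):
    """Total payoff to `me` over n >= 1 rounds against `opponent`."""
    if me == "always_defect":
        if opponent == "always_cooperate":
            return n * T
        if opponent == "tit_for_tat":
            return T + (n - 1) * P   # exploit once, then mutual defection
        return n * P
    if me == "tit_for_tat":
        if opponent == "always_defect":
            return S + (n - 1) * P   # lose the first round only
        return n * R
    # me == "always_cooperate"
    if opponent == "always_defect":
        return n * S
    return n * R


def expected_score(me, opponent_mix, n):
    """Average score against a known distribution of opponent strategies."""
    return sum(p * match_score(me, opp, n) for opp, p in opponent_mix.items())


def best_response(opponent_mix, n):
    return max(CANDIDATES, key=lambda s: expected_score(s, opponent_mix, n))


all_defectors = {"always_defect": 1.0}
half_and_half = {"always_defect": 0.5, "tit_for_tat": 0.5}
text_example = {"tit_for_tat": 0.5, "always_cooperate": 0.5}
for mix in (all_defectors, half_and_half, text_example):
    print(best_response(mix, 10), {s: expected_score(s, mix, 10) for s in CANDIDATES})
```

Under these assumptions, defecting is the best reply when everyone else defects, tit for tat is the best reply to an even split of defectors and tit-for-tat players, and in the 50% tit for tat, 50% always cooperate example defecting narrowly comes out ahead over ten rounds because the unconditional cooperators can be exploited.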
Monte Carlo simulations of populations have been run in which low-scoring individuals die off and high-scoring individuals reproduce (a genetic algorithm for finding an optimal strategy). The mix of algorithms in the final population generally depends on the mix in the initial population.
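The dependence on the initial mix can be illustrated with a small population model. Note that this sketch swaps the Monte Carlo approach for a deterministic, replicator-style update (each strategy's share grows in proportion to its average score), which is easier to reproduce. The payoffs, the 10-round match length, and the two-strategy population are assumptions.

```python
C, D = "C", "D"
# Assumed payoff to the first player for each pair of moves.
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}

# Each strategy maps the opponent's previous move (None in round 1) to a move.
STRATEGIES = {
    "tit_for_tat": lambda opp_last: C if opp_last is None else opp_last,
    "always_defect": lambda opp_last: D,
}


def match_score(name_a, name_b, rounds=10):
    """Total payoff to strategy a over one match against strategy b."""
    strat_a, strat_b = STRATEGIES[name_a], STRATEGIES[name_b]
    last_a = last_b = None
    total = 0
    for _ in range(rounds):
        move_a, move_b = strat_a(last_b), strat_b(last_a)
        total += PAYOFF[(move_a, move_b)]
        last_a, last_b = move_a, move_b
    return total


def next_generation(shares):
    """Low scorers shrink, high scorers grow (fitness-proportional update)."""
    fitness = {
        s: sum(shares[o] * match_score(s, o) for o in shares) for s in shares
    }
    mean_fitness = sum(shares[s] * fitness[s] for s in shares)
    return {s: shares[s] * fitness[s] / mean_fitness for s in shares}


def evolve(shares, generations):
    for _ in range(generations):
        shares = next_generation(shares)
    return shares


print(evolve({"tit_for_tat": 0.50, "always_defect": 0.50}, 50))
print(evolve({"tit_for_tat": 0.01, "always_defect": 0.99}, 50))
```

With these assumed numbers, tit for tat takes over when it starts at half the population but dies out when it starts at 1%, because a lone tit-for-tat player meets too few cooperators to recoup its first-round losses.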
Although tit for tat had long been regarded as the most robust basic strategy, at the 20th-anniversary iterated prisoner's dilemma competition a team from the University of Southampton (led by Nicholas Jennings [1] and including Rajdeep Dash, Sarvapali Ramchurn, Alex Rogers and Perukrishnen Vytelingum) introduced a new strategy that proved more successful than tit for tat. The strategy relied on cooperation between programs to achieve the highest score for a single program. The University of Southampton submitted 60 programs to the competition. These programs were designed to recognize each other through a series of five to ten moves at the start. Once this recognition was made, one program would always cooperate and the other would always defect, ensuring the maximum number of points for the defector. If a program realized that it was playing a non-Southampton entrant, it would defect continually in an attempt to minimize the score of the competing program. As a result [5], this strategy finished with the top three positions in the competition, as well as a number of positions near the bottom.
Although this strategy proved more effective than tit for tat in that competition, it did so by exploiting the fact that multiple entries were allowed. In a competition where each participant controls only a single player, tit for tat is certainly the better strategy.
If the prisoner's dilemma is repeated exactly N times, for some known constant N, another interesting fact emerges: the Nash equilibrium is to defect every time. This is easily proved by induction. You might as well defect in the last round, since your opponent will have no chance to punish you. Therefore, both players will defect in the last round. Then you might as well defect in the second-to-last round, since your opponent will defect in the last round whatever you do, and so on. For cooperation to remain attractive, then, the future must be uncertain for both players. One solution is to make the total number of rounds N random, so that the players cannot know when the game will end.
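One way to model a random number of rounds is to assume that after every round the game continues with a fixed probability w. The sketch below, under that assumption and with assumed payoffs T=5, R=3, P=1, S=0, compares the expected total payoff of cooperating with a tit-for-tat opponent against two simple deviations. It is an illustration, not a complete equilibrium analysis.

```python
def expected_totals(w, T=5, R=3, P=1, S=0):
    """Expected total payoffs against a tit-for-tat opponent when each
    round is followed by another with probability w (0 <= w < 1)."""
    return {
        # Mutual cooperation in every round: R + wR + w^2 R + ...
        "tit_for_tat": R / (1 - w),
        # Exploit once, then mutual defection: T + wP + w^2 P + ...
        "always_defect": T + w * P / (1 - w),
        # Defect, cooperate, defect, ...: T + wS + w^2 T + w^3 S + ...
        "alternate_defect_cooperate": (T + w * S) / (1 - w * w),
    }


def cooperation_is_best_reply(w, **payoffs):
    """True if cooperating does at least as well as both deviations."""
    totals = expected_totals(w, **payoffs)
    return totals["tit_for_tat"] >= max(totals.values())


for w in (0.0, 0.5, 0.9):
    print(w, cooperation_is_best_reply(w), expected_totals(w))
```

With w = 0 the game is one-shot and defection pays; as w approaches 1, the prospect of future rounds outweighs the one-time gain from defecting, which is the sense in which an uncertain horizon keeps cooperation attractive.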
Another odd case is the "play forever" prisoner's dilemma. The game is repeated infinitely many times, and your score is the average (suitably computed).
The prisoner's dilemma game is fundamental to certain theories of human cooperation and trust. On the assumption that the prisoner's dilemma can model transactions between two people that require trust, cooperative behavior in populations can be modeled by multi-player, iterated versions of the game. It has consequently fascinated many scholars over the years. In 1975, Grofman and Pool estimated that more than 2,000 scholarly articles had been devoted to it.