Token Model Simulation
#1 Fools Agreement Part 1: Introducing ‘Fools Agreement’ and Simulation Environment
#1 Fools Agreement Part 2 : Simulation Result Analysis
Simulation Result Analysis
In our previous post, we explained why simulation is necessary and how to build one. In this post, we analyze the simulation results.
Basic Mechanism (M1) Result
Before analyzing the results of the basic mechanism, let’s review how it works for readers who are not familiar with oracle voting. The basic mechanism confiscates the tokens that voters deposited on the losing option and gives them to the winning voters as rewards. The confiscated tokens are split among the winning voters in proportion to the amount each one deposited. (Naturally, the winning voters also receive their full deposit back.)
For example, consider a scenario in which the oracle vote came out True by 200 tokens to 100 and I deposited 50 tokens on True. The winning voters split the 100 tokens deposited on False. Since I deposited 50 of the 200 tokens on True, I receive 25% of the 100 tokens deposited on False (25 tokens) plus my original 50-token deposit. This is the most basic incentive mechanism used in oracle voting.
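The payout rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example, not the simulation code itself; the function name `basic_mechanism_payouts` and the shape of the `deposits` dictionary are assumptions made for this sketch.

```python
def basic_mechanism_payouts(deposits, outcome):
    """Pro-rata payout of the basic mechanism (M1).

    deposits: dict mapping voter name -> (choice, amount)  (hypothetical layout)
    outcome:  the winning option (True or False)
    Returns a dict mapping voter name -> tokens received after the vote.
    """
    winning_total = sum(amt for choice, amt in deposits.values() if choice == outcome)
    losing_total = sum(amt for choice, amt in deposits.values() if choice != outcome)

    payouts = {}
    for voter, (choice, amt) in deposits.items():
        if choice == outcome:
            # Winners get their deposit back plus a share of the losing
            # deposits proportional to their own deposit.
            payouts[voter] = amt + losing_total * amt / winning_total
        else:
            # Losers forfeit their whole deposit.
            payouts[voter] = 0.0
    return payouts


# The example from the text: True wins 200 to 100, and "me" deposited 50 on True.
deposits = {
    "me": (True, 50),
    "other_true_voters": (True, 150),
    "false_voters": (False, 100),
}
payouts = basic_mechanism_payouts(deposits, outcome=True)
print(payouts["me"])  # 75.0 (50 deposit + 25 reward)
```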
Now, let’s dig into the results of the basic mechanism (M1), which will serve as the baseline for the rest of the analysis.
According to the simulation results, the voters quickly realize that researching the vote is advantageous and move toward conducting research. However, once the research rate reaches 75~80%, it does not go higher. This happens because the lower 80% of the token holders have a noticeably lower research rate than the top 20%.
Why doesn’t the lower 80% conduct research? Their research rate per timestep traces an inverted U shape: it rises slightly at first and then keeps declining. Because the research rate drops instead of steadily increasing, the overall rate saturates. The graph takes this shape because token inequality worsens rapidly. At first, the whales and private investors all realize the value of researching and actively study. Whales who hold a substantial amount of tokens reap most of the rewards when their vote wins.
The token inequality only gets worse from this point on. This is depicted in the Whale Index graph on the upper right. The Whale Index indicates the share of the winning side’s votes that belongs to the voter holding the largest amount of tokens (the whale). Taking a closer look, you can see that the Whale Index reaches almost 50% around timesteps 20~25, showing severely aggravated inequality.
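As a rough sketch, the Whale Index for a single vote could be computed as follows. This assumes the whale is taken to be the largest depositor on the winning side; the function name and the list-of-deposits input are illustrative choices, not the simulation's actual interface.

```python
def whale_index(winning_deposits):
    """Share of the winning side's tokens that belongs to its largest voter.

    winning_deposits: list of token amounts deposited on the winning option
    (hypothetical input format).
    """
    total = sum(winning_deposits)
    if total == 0:
        return 0.0  # no winning deposits, so the index is undefined; use 0
    return max(winning_deposits) / total


# One voter deposited 90 of the 180 winning tokens, so the index is 0.5.
print(whale_index([90, 50, 30, 10]))  # 0.5
```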
Because private investors receive only a small amount of reward tokens, they start to feel that researching ahead of votes is not worth their time. The simulation data shows that the lower 20% and 10% of the winning voters receive negative rewards from timestep 15 onward, even though they did their research and voted on the winning side. Moreover, once the whales hold enough share to dictate the vote results, private investors lose all incentive to study, since they can no longer affect the outcome.
In summary, the research rate does not grow because, as the votes proceed, the whales gain more influence and take more of the reward pie, which eliminates the motivation for other voters to vote. The wealth inequality that results as whales gradually obtain more tokens through their research diminishes the research rate.
The problem is that whales actively conducting research is not in itself a bad thing. Moreover, artificial measures that hinder whales from conducting research or strip their voting rights would be inappropriate as well as infeasible.
Then, what should be done to solve this problem? We hypothesized that we could raise the voters’ research rate by providing appropriate research incentives and by using mechanisms that alleviate inequality, and we used simulation to test this hypothesis.
Hypothesis to Increase Research Rate
In each game, a voter can end up in one of four situations: winning the vote after doing research, losing the vote despite doing research, winning the vote without research, or losing the vote without research. This can be depicted as a 2×2 matrix, with research (R) or no research (N) on one axis and winning (W) or losing (L) on the other. From here on, we use this matrix to explain our analysis.
A proper mechanism would give a positive (+) incentive to voters who research (R) in order to raise the research rate. The issue is that, because of information asymmetry, it is impossible to know whether a given voter actually conducted research. The game results only show whether the voter won or lost; they cannot distinguish a win with research from a win without it. This makes it hard to design a mechanism that rewards only the people who conducted research.
To solve this problem, we hypothesized that agents who have won four or more of the past five games have most likely conducted research. An agent could win without research once or twice thanks to luck, but winning four times or more seems to be the result of research.
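To get a feel for how strong this signal is, the sketch below checks a voter's record and computes how often pure luck would produce it. The 50% win chance for a voter who does no research is purely an illustrative assumption, not a figure from the simulation, and both function names are hypothetical.

```python
from math import comb


def passes_record_signal(past_results, wins_needed=4, window=5):
    """True if the voter won at least `wins_needed` of the last `window` games.

    past_results: list of booleans, oldest first (True = won that game).
    """
    return sum(past_results[-window:]) >= wins_needed


def lucky_record_probability(p_win, wins_needed=4, games=5):
    """Probability of winning at least `wins_needed` of `games` independent
    games when each game is won with probability `p_win` (binomial tail)."""
    return sum(
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(wins_needed, games + 1)
    )


print(passes_record_signal([True, True, False, True, True]))  # True
# Assumed coin-flip voter: (5 + 1) / 32 of all five-game records qualify.
print(lucky_record_probability(0.5))  # 0.1875
```

Under that assumption, fewer than one in five uninformed voters would pass the filter by luck, which is why the record can serve as a noisy signal of research rather than as proof of it.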
Mechanism 2 (M2) Hypothesis: The Rescue Mechanism
Mechanism 2 aims to alleviate the penalty in the (R, L) cell of the matrix above. A portion of the tokens that would be rewarded to the winning voters is given back to voters who lost. Note that tokens are given back only to those who have won four or more of the past five games.
The core idea of M2 is to lessen the blow for voters who have diligently conducted research but were unlucky. Because there is always a possibility of losing a vote even after studying, levying a harsh penalty for one unfortunate loss would discourage voters from researching. Whales who have gathered enough share to sway the vote results could also lead the majority to the wrong option through bad judgment. It would be harsh to penalize voters for the misjudgment of whales just because they hold a small share.
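A minimal sketch of the rescue idea is shown below. The post does not state how large the returned portion is or how it is split, so this sketch assumes each eligible loser gets back a fixed fraction of their own deposit (`rescue_fraction`, a hypothetical parameter) and that the remainder of the losing deposits is shared pro rata among the winners. `win_history` is assumed to hold results of past games only.

```python
def rescue_payouts(deposits, outcome, win_history, rescue_fraction=0.5):
    """Sketch of M2: losers with a 4-of-5 winning record recover part of their deposit.

    deposits:    voter -> (choice, amount)                (hypothetical layout)
    win_history: voter -> list of past results, oldest first (True = won)
    """
    winners = {v: amt for v, (choice, amt) in deposits.items() if choice == outcome}
    losers = {v: amt for v, (choice, amt) in deposits.items() if choice != outcome}

    payouts = {}
    refunded = 0.0
    for voter, amt in losers.items():
        wins_in_last_five = sum(win_history.get(voter, [])[-5:])
        # Assumed rule: eligible losers recover a fixed fraction of their deposit.
        refund = rescue_fraction * amt if wins_in_last_five >= 4 else 0.0
        payouts[voter] = refund
        refunded += refund

    # Whatever was not refunded is shared among winners in proportion to deposits.
    reward_pool = sum(losers.values()) - refunded
    winning_total = sum(winners.values())
    for voter, amt in winners.items():
        payouts[voter] = amt + reward_pool * amt / winning_total
    return payouts


deposits = {"a": (True, 100), "b": (True, 100), "c": (False, 60), "d": (False, 40)}
history = {
    "c": [True, True, True, True, False],   # 4 of 5: eligible for rescue
    "d": [False, True, False, False, True],  # 2 of 5: not eligible
}
payouts = rescue_payouts(deposits, True, history)
print(payouts["c"], payouts["d"], payouts["a"])  # 30.0 0.0 135.0
```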
Mechanism 3 (M3) Hypothesis: Emerson Mechanism
“Shallow men believe in luck. Strong men believe in cause and effect.” — Ralph Waldo Emerson
Mechanism 3 reinforces the (W, R) reward and weakens the (W, N) reward. M3 does not compensate all winning voters, only those who have a good track record. Specifically, rewards are given only to voters who have won four or more times out of five.
The key idea of M3 is that, instead of rewarding winners who simply got lucky without doing research, it aims for a high research rate by rewarding those who have been diligent. To get around the problem of not being able to tell a win with research from a win without it, M3 treats a voter’s 4/5 record as a signal.
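The following sketch shows one way the M3 rule could look. It assumes that winners without the 4-of-5 record still get their deposit back but no reward, that eligible winners share the losing deposits pro rata, and that the pool stays undistributed when no winner qualifies. None of these details are spelled out in the post, and the names are hypothetical.

```python
def emerson_payouts(deposits, outcome, win_history, wins_needed=4, window=5):
    """Sketch of M3: only winners with a strong recent record earn rewards.

    deposits:    voter -> (choice, amount)                (hypothetical layout)
    win_history: voter -> list of past results, oldest first (True = won)
    Returns (payouts, undistributed_tokens).
    """
    winners = {v: amt for v, (choice, amt) in deposits.items() if choice == outcome}
    losers = {v: amt for v, (choice, amt) in deposits.items() if choice != outcome}
    pool = sum(losers.values())

    eligible = {
        v: amt
        for v, amt in winners.items()
        if sum(win_history.get(v, [])[-window:]) >= wins_needed
    }
    eligible_total = sum(eligible.values())

    payouts = {v: 0.0 for v in losers}  # losers forfeit their deposits
    for voter, amt in winners.items():
        # Assumed rule: every winner keeps the deposit; only eligible ones share the pool.
        bonus = pool * amt / eligible_total if voter in eligible else 0.0
        payouts[voter] = amt + bonus

    undistributed = pool if not eligible else 0.0
    return payouts, undistributed


deposits = {"a": (True, 100), "b": (True, 100), "c": (False, 100)}
history = {
    "a": [True, True, True, True, True],     # diligent record: rewarded
    "b": [True, False, False, True, False],  # weak record: deposit back only
}
payouts, leftover = emerson_payouts(deposits, True, history)
print(payouts["a"], payouts["b"], payouts["c"], leftover)  # 200.0 100.0 0.0 0.0
```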
Mechanism 4 (M4) Hypothesis: High Controversity Mechanism
The high controversity mechanism is based on a slightly different premise than the aforementioned mechanisms. Here, the reward and penalty are adjusted in accordance with how controversial the vote was. Before moving forward, we will explain what ‘controversity’ means. We defined controversity as:
Controversity = H(the ratio of tokens deposited on True and False)
H(X) is the entropy of a random variable X, a measure of how much uncertainty it holds. In general, entropy is low when the probability is concentrated on one outcome and high when it is spread evenly. Let’s look at an example dealing with uncertainty. If the vote result is 100:0, we can have full confidence in the result. However, if the vote result is 51:49, we cannot be sure whether the result is right. We tied this uncertainty to the controversity of the vote result: we calculated the entropy of the vote ratio and used it as the value of controversity.
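For a two-option vote, this is the binary entropy H(p) = -p·log(p) - (1-p)·log(1-p), where p is the share of tokens deposited on True. A minimal sketch follows; the base-2 logarithm is an assumption (the post does not state the base), chosen here because it keeps controversity between 0 and 1.

```python
from math import log2


def controversity(true_tokens, false_tokens):
    """Binary entropy (base 2, assumed) of the token split between True and False."""
    total = true_tokens + false_tokens
    if total == 0:
        return 0.0  # no deposits at all: treat as no controversy
    entropy = 0.0
    for tokens in (true_tokens, false_tokens):
        p = tokens / total
        if p > 0:  # by convention, 0 * log(0) contributes nothing
            entropy -= p * log2(p)
    return entropy


print(controversity(100, 0))  # 0.0, a unanimous vote has no uncertainty
print(controversity(50, 50))  # 1.0, an even split is maximally controversial
print(round(controversity(51, 49), 4))  # 0.9997, 51:49 is almost as uncertain
```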
As the results of the vanilla (basic) mechanism show, 75–80% of agents vote after doing research. In real-life services, we can likewise expect most users, apart from a few malicious ones, to vote honestly for the right answer. We therefore reasoned that when the vote result was False, a portion of the voters had chosen False against the majority, which made False the winning choice by a narrow margin. When the vote result was True, we reasoned that the absolute majority had most likely voted correctly and produced a dominant True vote.
We compared the controversity of votes that resulted in True with that of votes that resulted in False in our simulation of the vanilla mechanism. The result was the density plot shown above.
In the controversity mechanism, even agents who lost the vote were given back a portion of their deposit equal to the controversity. Agents who won the vote were given rewards scaled by 1 − controversity.
This setting was intended to reduce the penalty when the result is controversial. The purpose was to increase the research rate by alleviating the penalty on agents who lost the vote despite their research.
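One possible reading of this rule is sketched below: losers recover a fraction of their deposit equal to the controversity, and the remaining (1 − controversity) share of the losing deposits is split among winners in proportion to their deposits. This interpretation, the base-2 entropy, and all names are assumptions made for illustration.

```python
from math import log2


def controversity(true_tokens, false_tokens):
    """Binary entropy (base 2, assumed) of the token split."""
    total = true_tokens + false_tokens
    entropy = 0.0
    for tokens in (true_tokens, false_tokens):
        p = tokens / total if total else 0.0
        if p > 0:
            entropy -= p * log2(p)
    return entropy


def controversity_payouts(deposits, outcome):
    """Sketch of M4. deposits: voter -> (choice, amount) (hypothetical layout)."""
    true_total = sum(amt for choice, amt in deposits.values() if choice)
    false_total = sum(amt for choice, amt in deposits.values() if not choice)
    c = controversity(true_total, false_total)

    winning_total = true_total if outcome else false_total
    losing_total = false_total if outcome else true_total
    reward_pool = (1 - c) * losing_total  # the part of losing deposits that is confiscated

    payouts = {}
    for voter, (choice, amt) in deposits.items():
        if choice == outcome:
            payouts[voter] = amt + reward_pool * amt / winning_total
        else:
            payouts[voter] = c * amt  # losers recover a controversity-sized share
    return payouts


# A perfectly split vote: controversity is 1, so nobody gains or loses tokens.
print(controversity_payouts({"a": (True, 50), "b": (False, 50)}, outcome=True))
# {'a': 50.0, 'b': 50.0}
```

Under this reading the total number of tokens is conserved, and the more lopsided the vote, the closer the payout gets to the basic mechanism.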
Mechanism Result Analysis
Now, we will analyze the results of each mechanism one by one.
Mechanism 2 Result Analysis
- The research rate was meaningfully higher than under the basic mechanism.
- The bottom 80% also had a high research rate.
- The True Ratio, which shows how often the oracle’s result matched the actual truth, was high.
The reason for this result is that letting agents who lost the vote reclaim a portion of their token deposit reduced the level of inequality. The token share of Top 1 (the agent holding the most tokens) was low, and the voting influence of Top 1 on the winning side was also low compared to other mechanisms.
Mechanism 3 Result Analysis
- The overall research rate and research rate of the lower 80% did not improve.
- The True Ratio did not improve either.
Mechanism 3 failed to improve the research rate because it did nothing to alleviate inequality. Its objective was to increase the research rate by maximizing the reward given to agents who had diligently done their research when they won. In practice, however, agents with larger token holdings saw increased rewards regardless of whether they conducted research, and the mechanism did not motivate agents to conduct research.
Mechanism 4 Result Analysis: Controversity Mechanism
- Mechanism 4 showed a drastically lower research rate than the other mechanisms.
- Yet the True Ratio clearly improved.
The controversity mechanism showed conflicting features. First, the research rate was conspicuously lower than under the other mechanisms. This led us to expect that the True Ratio would also be lower. Unexpectedly, however, the True Ratio turned out to be drastically higher than the others.
We looked into the factors behind this result. The biggest difference between the controversity mechanism and the others was that inequality in token distribution did not progress as rapidly. Under the other mechanisms, one agent would end up holding 50~60% of the tokens at timestep 50, whereas under mechanism 4 one agent would hold only 40% or less. This reduction in inequality prevented one or two agents from manipulating the vote results. Consequently, we believe that mechanism 4 showed a higher True Ratio despite its relatively low research rate because the results reflected the votes of the 70~80% of agents who actually did their research.
The objective of the controversity mechanism was to lessen the penalty on agents who voted for the losing side when the vote ended with the incorrect result. The above figure compares the average penalty that agents received when the vote concluded with the correct result and with the wrong result. We can confirm that agents who voted for the truth received reduced penalties even though the vote result leaned towards False.
Through the simulations we have conducted, we learned that it is crucial to design an oracle mechanism that prevents inequality. Malevolent whales who seek to use their influence to manipulate the vote results and well-meaning whales who truly want to contribute to the network can both bring about detrimental consequences. Because of their substantial holdings, a single mistake by a whale can greatly affect voting results. If it is impossible to alleviate the level of inequality artificially, there at least needs to be a system or device in place to check the whales.
Decon’s goal is to continue developing simulation projects. The recent AI conference hosted by SKT and Samsung Electronics allotted more time to world-renowned researchers who focus on reinforcement learning. More and more machine learning and deep learning researchers are drawn to studying reinforcement learning, and the field is advancing quickly. Our team strives to follow these research trends while conducting our own research.
Building on our research capacity, Decon plans to proceed with simulation projects in more complicated and realistic environments. We ask for your active support.