## Abstract
A revised version of the proposed research reward mechanism described in [[1]](https://steemit.com/gridcoin/@ilikechocolate/internal-hardware-optimization-hardware-profiling-database-and-dynamic-work-unit-normalization) is presented. The updates respond to real-world conditions that limit the normalization process between different hardware on different project applications. Advantages and disadvantages of the proposal are discussed, as well as the feasibility of implementation.
## Acknowledgements
In addition to everyone mentioned in the [original proposal](https://steemit.com/gridcoin/@ilikechocolate/internal-hardware-optimization-hardware-profiling-database-and-dynamic-work-unit-normalization), I would like to thank again @jamescowens, especially for conversations regarding how to store the information required to implement this proposal, as well as @cycy, especially for help navigating through BOINC's source code and providing his WU data for analysis.
## Description and Benefits of the Proposal
The proposal (both the revised and the original version) can be summarized as treating the Gridcoin network as one massive supercomputer and rewarding crunchers with GRC in proportion to the processing power they contribute.
Reasons for changing the incentive structure to this proposal include:
1. This proposal relieves the network of necessarily allocating the same amount of GRC to each project
2. As a consequence of (1) and by design, it eliminates the same type of hardware receiving vastly different rewards on different projects
3. The GRC minting mechanism can be scaled in an intelligent fashion
4. Since the normalization is constructed from hardware/project pair data, it allows us to introduce non-BOINC based distributed computing paradigms
5. The normalization ties the minting of GRC to two important units: FLOPs and Joules (computing power and energy consumption)
6. The information from (5) allows for data analysis that can improve the efficiency of the network
7. A more robust project greylisting mechanism can be implemented, since the network can more precisely measure computing power vs. available WUs
8. A much more accurate total network computing power for both Gridcoin and BOINC can be calculated
9. The possibility for a GRC/green energy exchange market where crunchers can offset their carbon emissions becomes much more realistic, as described in [[2]](https://steemit.com/gridcoin/@ilikechocolate/multi-currency-economy-and-gridcoin-as-the-world-s-largest-decentralized-green-energy-powered-supercomputer)
10. The proposal necessitates the creation of a hardware profiling database, the benefits of which are also described in [[1]](https://steemit.com/gridcoin/@ilikechocolate/internal-hardware-optimization-hardware-profiling-database-and-dynamic-work-unit-normalization)
## Revised Proposal
### Changes
The basis of the Equivalence Ratio (ER) described in [[1]](https://steemit.com/gridcoin/@ilikechocolate/internal-hardware-optimization-hardware-profiling-database-and-dynamic-work-unit-normalization) has been switched from Total Credit Delta (TCD) to Recent Average Credit (RAC), which is the current WU contribution measurement for BOINC.
The main reasons for sticking with RAC are (quoting from @jamescowens):
> 1. It [TCD] makes us much more susceptible to credit hacks, because the RAC has a charge up halflife of 7 days and so blunts a magnitude play.
> 2. Projects sometimes rollback credits for various reasons... difficult to deal properly with that.
> 3. It [TCD] introduces an instability between the SB intervals to calculate the rewards and the actual points the stats are sampled. RAC can be thought of as essentially a "rate", because it is smoothed, whereas TCD is a point in time value. If you are incorporating 6 hour old TCD's from one of the projects in the SB, you are not using an accurate basis for TCD incorporation in the SB.
The original pseudocode for the method based on TCD can be found in the Appendix - the new one is presented below. Changes from the old pseudocode include:
1. M = |H| x |A|, where H is the same, and A is the set of projects, no longer the set of project applications
2. W = |B| x |A|, where B is the same, and A is modified as above
3. N = |B| x |A|, where B is the same, and A is modified as in M and W
### Pseudocode
<center>https://steemitimages.com/700x1100/https://cdn.steemitimages.com/DQmczmNzPXTaiEXQnAQQeAoKZ5GHjMKeUtacQc6thuAsYqv/image.png</center>
### Runtime Analysis
Under realistic conditions, where growth in the number of projects and hardware types is significantly outpaced by growth in the number of beacons, this algorithm runs in O(|B|) time. Ignoring the outer superblock loop:
1. the first for loop performs |A| x [ (1x|H| matrix) x (|H|x1 matrix) ] calculations, i.e. for each project, a scalar times an inner product of length |H|. Since |A| and |H| will in practice not grow enough to justify treating them as variables, the runtime is O(|A|x|H|) = O(1).
2. the second, nested for loop performs |B| x |A| multiplications, so with |A| treated as a constant the runtime is O(|B|).
3. the third for loop performs |B| x |A| calculations, so the runtime is likewise O(|B|).
Thus, the overall runtime is O(|B|) - i.e. it is linear in the number of beacons participating in the network.
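The three bounds can be combined into a single operation count per superblock. The symbol T is shorthand introduced here, and |A| and |H| are treated as bounded constants, as assumed at the start of this section:

```latex
\begin{align*}
T(|B|) &= \underbrace{|A|\,|H|}_{\text{loop 1}}
        + \underbrace{|B|\,|A|}_{\text{loop 2}}
        + \underbrace{|B|\,|A|}_{\text{loop 3}} \\
       &= |A|\,\bigl(|H| + 2\,|B|\bigr) \\
       &= O(|B|) \quad \text{when } |A|,\ |H| = O(1).
\end{align*}
```

If the number of projects were instead allowed to grow with the network, the same count would read O(|A| x |B|).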
## Impact of the Changes
1. Compared to the original proposal, for the everyday/consistent cruncher, the reward allocation will not be much different; in the limit of time, the rewards would be the same either way. In the Appendix is a modification of [the C++ code to calculate RAC from the old BOINC wiki](http://web.archive.org/web/20120418125739/http://www.boinc-wiki.info/Recent_Average_Credit) (also see @jamescowens' [treatment of RAC](https://www.reddit.com/r/gridcoin/comments/9on87y/how_long_does_it_take_for_my_racmag_to_level_out/e7vftj9/?context=3), or one of the many others available). It becomes clear from this simulation that the RAC of any single host/project pair as it approaches its asymptotic maximum is equal to the maximum number of credits that the host can crunch in 24 hours. In other words, there is a linear mapping between "maximum" RAC and WU/time, the latter of which is the basis for creating the ER. Thus, this new proposal based on RAC maintains that crucial aspect of the ER; however...
2. A game-theoretic issue arises out of using RAC over TCD - crunchers may be able to increase their rewards by jumping from project to project. This is currently the case as well, and it is unknown whether adopting this proposal will mitigate this exploit, exacerbate it, or leave it untouched. This can be analyzed if the community so desires.
3. The crunchers of projects that have comparatively lower processing power will no longer receive disproportionately higher rewards. This effectively eliminates the artificial economic incentive to crunch smaller projects. After discussing the possibilities for maintaining an artificial incentive to crunch smaller projects, @jamescowens proposed the following solution: effectively, designate a certain percentage of the network's total computing power to be evenly spread amongst all projects; if any project(s) receive(s) less than the intended amount, proportionally subtract rewards from the project(s) that exceed(s) the intended amount, and proportionally add rewards to the project(s) that received less than the intended amount. The community would decide whether or not it wants to implement this mechanism.
4. There is a slight problem that arises out of using RAC - the ER no longer normalizes credits across applications, but rather across entire projects. The granted credits of application WUs are *supposed* to be proportional to their running time within projects, but this is almost certainly not the case because 1) some hardware is inherently more capable than other hardware on a given application, and 2) projects may not have normalized their applications even on a single machine. This casts doubt on the accuracy of WU proportionality within a project, and it will impact the accuracy of the ER. However, a thorough investigation found that normalizing across applications is not possible at this moment, although it could be with the proper PRs submitted to BOINC.
This last point warrants further discussion, and provides a useful bridge to explain how the data to construct the matrix _M_ will be collected.
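Returning briefly to point 3 above: the redistribution idea attributed to @jamescowens is described only in words, so the following is a minimal C++17 sketch of one possible reading of it, not an agreed design. The function name `applyProjectFloor`, the parameter `floorFraction`, and the choice to represent each project by its share of total rewards are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Gridcoin or BOINC).
// shares[i]     : project i's fraction of total rewards before adjustment (sums to 1).
// floorFraction : fraction of total rewards meant to be spread evenly over all projects.
// Projects below the even-spread target are raised to it; the amount needed is
// taken from projects above the target, in proportion to how far above they are.
std::vector<double> applyProjectFloor(const std::vector<double>& shares,
                                      double floorFraction) {
    if (shares.empty()) return {};
    const double target = floorFraction / static_cast<double>(shares.size());

    double totalDeficit = 0.0;  // how much the under-target projects are missing
    double totalExcess = 0.0;   // how much the over-target projects have to spare
    for (double s : shares) {
        totalDeficit += std::max(0.0, target - s);
        totalExcess += std::max(0.0, s - target);
    }

    std::vector<double> adjusted(shares);
    if (totalDeficit == 0.0 || totalExcess == 0.0) return adjusted;  // nothing to move

    for (std::size_t i = 0; i < shares.size(); ++i) {
        if (shares[i] < target) {
            adjusted[i] = target;  // top up to the intended amount
        } else {
            // subtract proportionally to this project's excess over the target
            adjusted[i] = shares[i] - totalDeficit * (shares[i] - target) / totalExcess;
        }
    }
    return adjusted;
}
```

With made-up shares of 0.70, 0.25, and 0.05 and a `floorFraction` of 0.30, the target per project is 0.10; the smallest project is raised to 0.10 and the other two drop to 0.66 and 0.24, so the total is unchanged. Other readings of the mechanism (for example, applying it to computing power rather than rewards) would need a different sketch.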
## How is _M_ built?
The host statistics export files, which all projects have available, contain much of the relevant information needed to construct _M_ - see the Appendix for an example of such a file. In particular, the `expavg_credit` field is determined by the same code that was used to create the aforementioned modified RAC code. The `credit_per_cpu_sec` field is deprecated, although it would provide useful information if it were not. The `expavg_credit` field in conjunction with the `p_model` field is enough to construct _M_.
While the host statistics export files contain sufficient information to construct _M_ as described in this current proposal, they are missing key information necessary to construct the matrix _M_ on a per application basis as described in the original proposal - namely, the RAC per application. After going through the BOINC source code (many thanks to @cycy for help navigating through it), it seems like it would be necessary to submit a PR to BOINC to actually be able to obtain such information.
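As a rough illustration of how little of each host record is needed at the project level, the C++17 sketch below pulls `p_model` and `expavg_credit` out of a single `<host>` element. `HostSample`, `tagValue`, and `parseHost` are hypothetical names introduced here, and the naive string search is only a stand-in for a real XML parser; this is not how the scraper does it.

```cpp
#include <cassert>
#include <cmath>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical record: the two fields the text identifies as sufficient for M.
struct HostSample {
    std::string p_model;   // hardware identifier as reported by the host
    double expavg_credit;  // the host's RAC on this project
};

// Returns the text between <tag> and </tag>, or nullopt if the tag is missing.
// Assumption: tags are not nested or repeated inside one <host> element.
std::optional<std::string> tagValue(const std::string& xml, const std::string& tag) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const auto start = xml.find(open);
    if (start == std::string::npos) return std::nullopt;
    const auto from = start + open.size();
    const auto end = xml.find(close, from);
    if (end == std::string::npos) return std::nullopt;
    return xml.substr(from, end - from);
}

// Builds a HostSample from one <host>...</host> element, or nullopt if either
// field is missing or expavg_credit is not a number.
std::optional<HostSample> parseHost(const std::string& hostXml) {
    const auto model = tagValue(hostXml, "p_model");
    const auto rac = tagValue(hostXml, "expavg_credit");
    if (!model || !rac) return std::nullopt;
    try {
        return HostSample{*model, std::stod(*rac)};
    } catch (const std::logic_error&) {  // invalid_argument or out_of_range
        return std::nullopt;
    }
}
```

Note that `p_model` strings such as the one in the Appendix example include family, model, and stepping details, so some normalization of hardware names would presumably be needed before samples are grouped into rows of _M_.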
A good example of a file that would provide such data is the WU information file provided by World Community Grid - see the Appendix for an example. Below is a histogram of the WU/time of different hardware crunching different applications of WCG (many thanks to @cycy for providing this data):
<center>https://cdn.steemitimages.com/DQmeHkJ2wxZfpFM2zV6ughFMN7kd8uLpfxhGFDAFFejatGm/image.png</center>
As can be seen from this histogram, some sort of selection rule must be devised to determine how to choose a single number for the WU/time of a hardware/application pair (this also applies to the hardware/project pair in the current proposal).
Since the `expavg_credit` field in the host statistics export files is a reflection of the RAC, it will change over time; furthermore, if a host is not reaching its maximum RAC, this value will not reflect the maximum WU/time that is required to construct the matrix _M_. A simple approach would be to take the maximum value of `expavg_credit` for any hardware/project pair, but this is likely insufficient for a number of reasons, so such a rule must be chosen carefully.
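To make the trade-off concrete, here is a small C++17 sketch that compares the simple maximum rule with a nearest-rank percentile rule over the `expavg_credit` samples collected for each hardware/project pair. The names `PairKey`, `percentile`, and `selectEntries` are hypothetical, and the percentile rule is only one candidate used for illustration; the proposal does not prescribe it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key: (hardware model, project name).
using PairKey = std::pair<std::string, std::string>;

// Nearest-rank percentile of a non-empty sample set, with q in (0, 1].
// q = 1.0 reproduces the "take the maximum" rule.
double percentile(std::vector<double> samples, double q) {
    std::sort(samples.begin(), samples.end());
    const auto rank = static_cast<std::size_t>(std::ceil(q * samples.size()));
    const std::size_t idx =
        std::min(std::max<std::size_t>(rank, 1), samples.size()) - 1;
    return samples[idx];
}

// Collapses every pair's samples to the single number that would populate M.
// Pairs with no samples are skipped rather than given a made-up value.
std::map<PairKey, double> selectEntries(
        const std::map<PairKey, std::vector<double>>& racSamples, double q) {
    std::map<PairKey, double> entries;
    for (const auto& [key, samples] : racSamples) {
        if (!samples.empty()) entries[key] = percentile(samples, q);
    }
    return entries;
}
```

For a made-up sample set of 10, 4700, 4750, 4800, and 90000 (the last standing in for a misreported or multi-device host), `q = 1.0` selects 90000 while `q = 0.75` selects 4800, which shows how a single outlier can dominate the maximum rule.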
## Feasibility of Implementation
The primary piece of information that must be agreed upon by the network is the matrix _M_ described in the pseudocode. Extensive conversations with @jamescowens and @cycy led to the conclusion that this matrix can be stored and cross-verified using methods similar to those currently used by the new scraper for statistics collection. The exact details by which the nodes converge to the same _M_ must be discussed, but these are technical details that do not affect the viability of the proposal.
## Conclusion
An updated GRC reward mechanism would have many benefits for the network, including rewards more closely associated with actual computing power contributions, the ability to improve the efficiency of the network, and the opportunity to create a GRC/green energy exchange to offset the carbon emissions from crunchers' hardware. This proposal can also bring the Gridcoin community closer to BOINC by helping to improve BOINC's source code, and it can attract non-Gridcoin BOINCers to Gridcoin with these new features and the new, exciting opportunities they enable.
## References
[[1] Internal Hardware Optimization, Hardware Profiling Database, and Dynamic Work Unit Normalization](https://steemit.com/gridcoin/@ilikechocolate/internal-hardware-optimization-hardware-profiling-database-and-dynamic-work-unit-normalization)
[[2] Multi-Currency Economy and Gridcoin as the World's Largest, Decentralized, Green-Energy Powered Supercomputer](https://steemit.com/gridcoin/@ilikechocolate/multi-currency-economy-and-gridcoin-as-the-world-s-largest-decentralized-green-energy-powered-supercomputer)
## Appendix
### Pseudocode for Original Proposal
<center>https://steemitimages.com/700x1100/https://cdn.steemitimages.com/DQmU3Sgd3ERVHnhJV5Fd69qP7ACp8fZxhcifbPawPyd2Z3r/image.png</center>
### Modified RAC Code
The original code can be found [here](http://web.archive.org/web/20120418125739/http://www.boinc-wiki.info/Recent_Average_Credit). A main function and a fake time counter were added to the original to make the code suitable for simulations.
```cpp
#define _USE_MATH_DEFINES  // makes M_LN2 available on MSVC
#include <cmath>
#include <iostream>
#define SECONDS_PER_DAY 86400
using namespace std;

// Updates an exponentially decaying average of work per day (the RAC).
void update_average (
    double work_start_time,  // when new work was started (or zero if no new work)
    double work,             // amount of new work
    double half_life,
    double& avg,             // average work per day (in and out)
    double& avg_time,        // when average was last computed
    double& fakeTime         // new, for simulations
) {
    //double now = dtime();
    double now = fakeTime;
    if (avg_time) {
        double diff, diff_days, weight;
        diff = now - avg_time;
        if (diff<0) diff=0;
        diff_days = diff/SECONDS_PER_DAY;
        weight = exp(-diff*M_LN2/half_life);
        avg *= weight;
        if ((1.0-weight) > 1.e-6) {
            avg += (1-weight)*(work/diff_days);
        } else {
            avg += M_LN2*work*SECONDS_PER_DAY/half_life;
        }
    }
    else if (work) {
        // If first time, average is just work/duration
        //
        cout << "avg_time = " << avg_time << "\n";
        cout << "now = " << now << "\n";
        double dd = (now - work_start_time)/SECONDS_PER_DAY;
        avg = work/dd;
    }
    avg_time = now;
}

int main() {
    double RAC = 0;
    double timeOne = 1;  // nonzero, so the first-time branch above is skipped
    double timeTwo = 1;
    double totalCredit = 0;
    double timeInterval = 3600;  // new; time in seconds between each RAC update
    double work_start_time = 0;  // when new work was started (or zero if no new work)
    double work = 200;           // amount of new work
    double half_life = 604800;   // one week, in seconds
    double& avg = RAC;           // average work per day (in and out)
    double& avg_time = timeOne;  // when average was last computed
    double& fakeTime = timeTwo;  // new; for simulation
    for (int i=0; i<1500; i++) {
        if (1) {  // set to 0 to silence the daily report
            if (i % 24 == 0) {
                cout<<"week " << i/168 + 1 << ", day " << (i/24)%7 + 1 << "; current hour = " << i << "; ";
                cout<<"totalCredit = " << totalCredit << "; ";
                //cout<<"fakeTime = " << fakeTime << "; ";
                cout<<"RAC = "<< RAC << "\n";
            }
        }
        fakeTime += timeInterval;
        update_average(work_start_time, work, half_life, avg, avg_time, fakeTime);
        totalCredit += work;
    }
    cout<<"Final totalCredit = " << totalCredit << "\n";
    cout<<"Final fakeTime = " << fakeTime << "\n";
    cout<<"Final RAC = "<< RAC << "\n";
    return 0;
}
```
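For the constant schedule used in this simulation (the same `work` credited every `timeInterval` seconds, starting from `RAC = 0`), the update in `update_average` reduces to a linear recurrence whose limit can be written down directly. The symbols w, Δt, τ, and R_n below are shorthand introduced here for `work`, `timeInterval`, `half_life`, and the RAC after n updates:

```latex
\begin{align*}
\omega  &= e^{-\Delta t \,\ln 2/\tau} = 2^{-\Delta t/\tau}, \\
R_{n+1} &= \omega R_n + (1-\omega)\,\frac{86400\,w}{\Delta t}, \\
R^{*}   &= \frac{86400\,w}{\Delta t},
\qquad R_n = R^{*}\bigl(1-\omega^{n}\bigr) \quad (R_0 = 0).
\end{align*}
```

With w = 200, Δt = 3600 s, and τ = 604800 s, the limit is R* = 4800 credits per day, which is exactly the credit earned in 24 hours. The RAC reaches half of that after 168 updates (one week) and is within about 0.2% of it after the 1500 updates simulated here.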
### Example of a Host Statistics Export File
```xml
<host>
  <id>10</id>
  <userid>641059</userid>
  <total_credit>823243.112985</total_credit>
  <expavg_credit>0.071538</expavg_credit>
  <expavg_time>1333480853.984610</expavg_time>
  <p_vendor>GenuineIntel</p_vendor>
  <p_model>Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz [Family 6 Model 26 Stepping 5]</p_model>
  <os_name>Microsoft Windows 7</os_name>
  <os_version>x64 Edition, (06.01.7600.00)</os_version>
  <coprocs>[BOINC|6.10.58][CUDA|GeForce GTX 460|1|1023MB|30448]</coprocs>
  <create_time>1086269299</create_time>
  <rpc_time>1347796849</rpc_time>
  <timezone>7200</timezone>
  <ncpus>8</ncpus>
  <p_fpops>2230397818.915570</p_fpops>
  <p_iops>8494754137.090450</p_iops>
  <p_membw>125000000.000000</p_membw>
  <m_nbytes>12882857984.000000</m_nbytes>
  <m_cache>262144.000000</m_cache>
  <m_swap>17175875584.000000</m_swap>
  <d_total>159939297280.000000</d_total>
  <d_free>35758538752.000000</d_free>
  <n_bwup>75689.857377</n_bwup>
  <n_bwdown>107630.036610</n_bwdown>
  <avg_turnaround>95433.503340</avg_turnaround>
  <credit_per_cpu_sec>0.056947</credit_per_cpu_sec>
  <host_cpid>1cf9941bde51c9743b1f0e63a96e74a3</host_cpid>
</host>
```
### Example of World Community Grid WU File
```json
{"ResultsStatus": {
    "ResultsAvailable": "2586",
    "ResultsReturned": "100",
    "Offset": "0",
    "Results": [
        {
            "AppName": "zika",
            "ClaimedCredit": 75.43766926617201,
            "CpuTime": 1.6173177777777779,
            "ElapsedTime": 1.6184435683333331,
            "ExitStatus": 0,
            "GrantedCredit": 75.43766926617201,
            "DeviceId": 5007958,
            "ModTime": 1555951087,
            "WorkunitId": 1087822805,
            "ResultId": 928128208,
            "Name": "ZIKA_000420711_x5k6k_ZIKV_NS1_MD_model_5_s2_0097_0",
            "Outcome": 1,
            "ReceivedTime": "2019-04-22T16:38:01",
            "ReportDeadline": "2019-05-02T12:06:23",
            "SentTime": "2019-04-22T12:06:23",
            "ServerState": 5,
            "ValidateState": 1,
            "FileDeleteState": 0
        },
        ...
    ]
}
}
```