A Disappointing Optimization by ravonn

![](https://steemitimages.com/0x0/https://res.cloudinary.com/hpiynhbhq/image/upload/v1514795746/qvwknktbgfuyejlipei6.png)

This was originally going to be a post about the upcoming optimizations in Gridcoin. It was going to show how much faster the Camilla release will be compared to Betsy, complete with grandiose benchmarks showing that all our chain synchronization speed problems would be solved and that running on a Pi would be a breeze. Sometimes you overhype and things don't go as planned.

Benchmarking
============
In order to see how much faster we are now I chose to import the latest chain snapshot using `-loadblock`. This eliminates the network fluctuations and allows me to focus on the raw processing performance. The tests were done on two identical, somewhat slow computers:

- Core2Duo 2.66GHz
- 4GB RAM
- 7200 RPM HDD

I added output information to plot the total time spent at every 50k blocks imported, launched Betsy (staging) and Camilla (development) and let them run the import.
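
For reference, the per-chunk timing can be collected with something as small as the sketch below. This is not the actual instrumentation used for the benchmark: `ChunkTimer` and `OnBlock` are hypothetical names, and the only assumption is that the import loop can report the current height after each imported block.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: records the total elapsed time every time the
// import crosses a multiple of `chunk` blocks (50k in the benchmark).
class ChunkTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit ChunkTimer(std::int64_t chunk)
        : chunk_(chunk), start_(Clock::now()) {
        assert(chunk > 0);
    }

    // Call once per imported block with the new chain height.
    void OnBlock(std::int64_t height) {
        if (height > 0 && height % chunk_ == 0) {
            const std::chrono::duration<double> elapsed = Clock::now() - start_;
            samples_.emplace_back(height, elapsed.count());
        }
    }

    // (height, total seconds since start) pairs, one per completed chunk.
    const std::vector<std::pair<std::int64_t, double>>& Samples() const {
        return samples_;
    }

private:
    std::int64_t chunk_;
    Clock::time_point start_;
    std::vector<std::pair<std::int64_t, double>> samples_;
};
```

Subtracting consecutive samples gives the time spent on each individual chunk rather than the running total.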

![benchmark_1.png](https://cdn.steemitimages.com/DQmfV12ynjZC6BSPPNyLzxjceXphNcZciBxF7rDxFRkFvPs/benchmark_1.png)

Wait, what? This is mind-blowing, but in a bad way. With the [optimizations done](https://github.com/gridcoin-community/Gridcoin-Research/pull/1194) Camilla should have been the holy grail of chain syncs. It imports 1050k blocks around 45 minutes faster than Betsy, which is nowhere near my expectations. The benchmark did, however, reveal one very important thing which was new to me: block synchronization slows down over time. This becomes apparent when you plot the time it took to import each 50k block chunk.

![benchmark_2.png](https://cdn.steemitimages.com/DQmZqB6nirPdryFSbwak8sTujLrynWyReiVaCMSekb1AyPW/benchmark_2.png)

This shows a severe scaling issue and this most likely gets progressively worse. It might even explain why some people are having [issues syncing](https://github.com/gridcoin-community/Gridcoin-Research/issues/1200#issuecomment-410603887), though that is just a guess.

Investigations
==============
The behavior indicates one of two things:

 - We are using containers which do not scale well.
 - We have loops which increase in range over time.
 
Neither really fits the graph, as I would expect a more linear or consistent increase in times and certainly not a decrease as we see in the ranges 800k-850k and 950k-1000k, but taking a deeper look at the containers and loops is at least a starting point.
 
The best way to approach this is usually to profile a run to pinpoint where the CPU cycles go. Unfortunately, [Valgrind's Callgrind](http://valgrind.org/docs/manual/cl-manual.html) is far too slow to sync up enough for the problem to show, and [gprof](https://en.wikipedia.org/wiki/Gprof) does not show anything useful in this case. For example, in this 6h+ run it claims that scrypt used most of the CPU cycles.

```
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ks/call  Ks/call  name    
 37.40   1188.46  1188.46  4086906     0.00     0.00  scrypt_nosalt(void const*, unsigned long, void*)
```

Since gprof samples at regular intervals it tends to miss things which are small but done frequently. This is a hint that the problem might lie in how we access container data. We just have to investigate this manually.
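
One way to do that kind of manual investigation is to count calls and accumulate wall-clock time per code path yourself. The sketch below is only an illustration of the idea, not what was used in the client; `ScopedTimer`, `Stat` and `Stats` are hypothetical names. Keep in mind that the timer itself adds overhead to very small operations, so the numbers are best read relative to each other.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-label statistics: how often a code path ran and
// how much wall-clock time it used in total.
struct Stat {
    std::uint64_t calls = 0;
    std::chrono::nanoseconds total{0};
};

// Global registry of statistics, keyed by label.
inline std::map<std::string, Stat>& Stats() {
    static std::map<std::string, Stat> stats;
    return stats;
}

// RAII timer: measures from construction to destruction and adds the
// result to the entry for `label`.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* label)
        : label_(label), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        Stat& s = Stats()[label_];
        ++s.calls;
        s.total += std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* label_;
    std::chrono::steady_clock::time_point start_;
};
```

Placing a `ScopedTimer` at the top of a suspect function and dumping `Stats()` at every 50k blocks shows whether a cheap but frequently called path grows over time.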
 
Containers
----------
The two most frequently accessed or modified containers are the ones which hold the block indices and the [application cache](https://github.com/gridcoin-community/Gridcoin-Research/blob/ba0bd8effb38ea1a5d6a2a27f923c0fe86de67be/src/appcache.h). The block index map was recently [changed](https://github.com/gridcoin-community/Gridcoin-Research/issues/648) from a sorted map to `std::unordered_map`. This container is faster on access if you can generate cheap hashes, which we can, but maybe it does not scale as well. Since both Core2Duo computers were busy testing other solutions I ran a quick test on my workstation where I switched the container back to `std::map` and saw the exact same time graph, only with reduced times.
 
The second test was to switch the application cache to using `unordered_map` as well. 
 
![benchmark_3.png](https://cdn.steemitimages.com/DQmWje65xb1eABuBtEJvkFpABF7Rg7d6rtDXnoPvoqTqKuN/benchmark_3.png)
 
Ok. An improvement to the overall speed of the application but it has the same problem as before.
 
Escalating loops
-----------------
Most of the loops either have too small a range to be significant, or they have a long but constant range. One loop which stands out is the [tally loop in GetLifetimeCPID](https://github.com/gridcoin-community/Gridcoin-Research/blob/886d9c0f7c60c5bdd99fa56b7044940fe5f7cdac/src/main.cpp#L5720). The loop goes through each block staked by a certain CPID and collects statistics such as interest and PoR collected, total magnitude and stake times. This is done for [every connected PoR block](https://github.com/gridcoin-community/Gridcoin-Research/blob/886d9c0f7c60c5bdd99fa56b7044940fe5f7cdac/src/main.cpp#L3918), which definitely counts as an escalating loop. The longer the chain, the more blocks have been staked and the more blocks have to be iterated over.

We still need to be able to do this when disconnecting blocks but when connecting blocks we can focus on just tallying those particular block(s) on top of the previous tally numbers. It scales better and performs better.
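
The difference between the two approaches can be sketched as follows. The types here are simplified, hypothetical stand-ins (`StakedBlock`, `Tally`, `FullTally` and `ConnectBlock` are not the names used in the client); the point is only that the full tally walks the whole history while the incremental one touches a single entry per connected block.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified view of a staked block.
struct StakedBlock {
    std::string cpid;
    double research_reward;
    double interest;
};

// Hypothetical lifetime statistics for one CPID.
struct Tally {
    double research_total = 0.0;
    double interest_total = 0.0;
    long blocks = 0;
};

// Full re-tally: visits every block on the chain, so the cost of each
// call grows with the chain length.
Tally FullTally(const std::vector<StakedBlock>& chain, const std::string& cpid) {
    Tally t;
    for (const StakedBlock& b : chain) {
        if (b.cpid != cpid) continue;
        t.research_total += b.research_reward;
        t.interest_total += b.interest;
        ++t.blocks;
    }
    return t;
}

// Incremental tally: adds only the newly connected block on top of the
// previous numbers, so the cost per block stays constant.
void ConnectBlock(std::unordered_map<std::string, Tally>& tallies, const StakedBlock& b) {
    Tally& t = tallies[b.cpid];
    t.research_total += b.research_reward;
    t.interest_total += b.interest;
    ++t.blocks;
}
```

Both paths produce the same totals when blocks are only ever connected; the full walk remains as the fallback for the disconnect case.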

![benchmark_4.png](https://cdn.steemitimages.com/DQmTKRymnonkpYvbzkHxYZQuBZftuHk4tpLzHBsTVaRBXGr/benchmark_4.png)

Again, a tiny bit faster, still slow overall.

Monitoring appcache
-------------------
Let's go back to the application cache to see when and with what it gets populated during a chain sync.

Height | beacons | altbeacons | currentneuralsecurity
-|-|-|-
0|0|0|0
50k|0|0|0
100k|0|0|0
150k|0|0|0
200k|0|0|0
250k|1|1|0
300k|2|2|0
350k|4|4|0
400k|668|765|0
450k|1138|1324|0
500k|1512|1744|0
550k|2219|2532|785
600k|2448|3010|1093

Now we are getting somewhere. The slowdowns start at 500k and really ramp up after 550k. This coincides with the growth of both the altbeacons and the neural security hash counts. Let's have a look at where these entries are used to see if there's a connection.

The neural security entries stem from the `ComputeNeuralNetworkSupermajorityHashes` function which gathers all the superblock votes cast by the stakers. The function also populates additional data structures with the collected votes. To begin with these structures are ordered maps and we most often do not care about the ordering, so converting them to `unordered_map` and adding ordering code where needed improves the performance. There are also some odd and very slow constructs when inserting or modifying container data:

```
if (mvCurrentNeuralNetworkHash.size() > 0)
    temp_hashcount = mvCurrentNeuralNetworkHash[NeuralHash];

// ...

if (temp_hashcount == 0)
    mvCurrentNeuralNetworkHash.insert(map<std::string,double>::value_type(NeuralHash,0));

// ...

temp_hashcount += votes;
mvCurrentNeuralNetworkHash[NeuralHash] = temp_hashcount;
```

These types of checks are both redundant and extremely inefficient. Each time this code runs it does up to three map lookups using the same key. All of this can be replaced with:

```
mvCurrentNeuralNetworkHash[NeuralHash] += votes;
```

which does one single lookup and inserts a default value of 0 if missing.
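
To convince yourself that the two forms are equivalent, here is a small self-contained comparison. `VoteMap`, `AddVotesVerbose` and `AddVotes` are names made up for this sketch; the verbose version only mirrors the shape of the snippet above, not the full function in the client.

```cpp
#include <cassert>
#include <map>
#include <string>

using VoteMap = std::map<std::string, double>;

// Mirrors the shape of the original construct: up to three lookups
// with the same key for every vote that is added.
void AddVotesVerbose(VoteMap& votes_by_hash, const std::string& hash, double votes) {
    double count = 0;
    if (!votes_by_hash.empty())
        count = votes_by_hash[hash];                        // lookup 1, inserts 0 if missing
    if (count == 0)
        votes_by_hash.insert(VoteMap::value_type(hash, 0)); // lookup 2, no-op if present
    count += votes;
    votes_by_hash[hash] = count;                            // lookup 3
}

// Single lookup: operator[] value-initializes a missing entry to 0.0
// and returns a reference that is updated in place.
void AddVotes(VoteMap& votes_by_hash, const std::string& hash, double votes) {
    votes_by_hash[hash] += votes;
}
```

The same one-liner works unchanged if the container is later switched to `std::unordered_map`.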

Additionally we keep populating the appcache with "neuralsecurity" entries when we should clear them before collecting new ones. Without such a clear it will keep growing until the rules change in v9.
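
The effect of the missing clear is easy to show in isolation. This is a simplified sketch with hypothetical names (`Section`, `CollectVotes`), assuming each round of collection should describe only the current set of votes:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Section = std::unordered_map<std::string, double>;
using Votes = std::vector<std::pair<std::string, double>>;

// Collects one round of votes into the section. Without clearing first,
// entries from every earlier round stay around and the section only grows.
void CollectVotes(Section& section, const Votes& round, bool clear_first) {
    if (clear_first)
        section.clear();
    for (const auto& vote : round)
        section[vote.first] += vote.second;
}
```

With `clear_first` set, the section size is bounded by the number of distinct hashes in the latest round instead of by the length of the chain.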

![benchmark_5.png](https://cdn.steemitimages.com/DQmbXbZabkfD8SnPbm2zimt1ciqE9wJSz6bRCxHKD8cbFBU/benchmark_5.png)

There we go! Now we actually have an acceptable and consistent block acceptance performance. The process is still quite slow at reorganizing, which might explain the v8 bump. When importing a snapshot it seems to still process the blocks as they arrived, which would explain why there were a lot of reorgs around the v8 switch. We can work on that and improve the reorg performance.

Revisiting appcache
----------------------
At this point we could start wrapping things up and call it a day. I had one more thing I wanted to test out though, and that was to flatten the application cache which, despite its name, is not a cache but a global key/value store.

The cache is stored as a map of maps. The outer map serves as cache sections: beacons, projects, polls, that type of grouping. This means that every time you want to access a certain cached item you first have to do an additional lookup just to get to the section which contains it. Since we know the actual cache sections at compile time we can hard code the caches and have them accessible by an identifier instead of a string. The access is changed from

```
ReadCache("beacon", cpid);
```

to
```
ReadCache(Section::BEACON, cpid);
```

It's less flexible but faster and type safe.
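
A minimal sketch of what such a flattened store could look like is shown below. It is not the actual appcache implementation: `FlatCache`, its `Read`/`Write` members and the exact list of sections are assumptions made for illustration. The outer string-keyed map is replaced by a fixed array indexed by the enum, so reaching a section is a plain array access.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical list of sections, known at compile time.
enum class Section { BEACON, PROJECT, POLL, NUM_SECTIONS };

class FlatCache {
public:
    using SectionMap = std::unordered_map<std::string, std::string>;

    void Write(Section section, const std::string& key, const std::string& value) {
        sections_[Index(section)][key] = value;
    }

    // Returns an empty string when the key is not present.
    std::string Read(Section section, const std::string& key) const {
        const SectionMap& entries = sections_[Index(section)];
        const auto it = entries.find(key);
        return it == entries.end() ? std::string() : it->second;
    }

private:
    static std::size_t Index(Section section) {
        assert(section != Section::NUM_SECTIONS);
        return static_cast<std::size_t>(section);
    }

    // One map per section; no string lookup needed to find the section.
    std::array<SectionMap, static_cast<std::size_t>(Section::NUM_SECTIONS)> sections_;
};
```

A misspelled section name now fails at compile time instead of silently creating or missing a section at run time.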

![benchmark_6.png](https://cdn.steemitimages.com/DQmc5A5peBD6W8B67eZX71W4eGU3F2eYQgJTSQ2YjcBkUnn/benchmark_6.png)

Nothing spectacular. It saves around 10 minutes on my systems which is better than nothing, I suppose. It's always good to reduce the cost of a key/value store, especially considering how often we access it.

Results
=======
![results.png](https://cdn.steemitimages.com/DQmeegmCpfRvVFZN2JYdJc819oCbEDSXKzBLLNVYkuh8GL5/results.png)

The chain can now be imported and synced at an acceptable speed. We can continue focusing on the reorg speeds but this will do for now.

There are some important lessons to take from all of this:

- Don't fully trust the profiler, especially not if it's a sampling profiler.
- Pick your containers very carefully.
- Use your containers cautiously.

Now I can write that article.
Posted by ravonn in gridcoin on 2018-08-09 08:28:06 (last updated 2018-08-13 17:11:39). Tags: gridcoin, development, cryptocurrency.

@alpinegiant ·
Keep up the great work! The community always appreciates these updates.
@barton26 ·
Fantastic summary of all the work you've been doing on optimization ravon!  However, this post is not truly complete without the obligatory image of Camilla the Chicken: 

![](https://cdn.steemitimages.com/DQmbbQyc184jXMFwH2ZXLcQVfYYhirhicmXR54QSByQSHrE/image.png)

Keep up the great work :)
@cm-steem ·
Excellent post @ravonn! Keep up the amazing work & these cool posts :D

Great to see such a dramatic reduction in import times! 👍
@jamescowens ·
Not so disappointing in the end! :)
@ravonn ·
Worth noting here is that while I only did an import, you are going to see an equal reduction in sync times as well. Syncing 0-1M over the network should be 13h faster when/if the PRs are accepted.
@theissen · (edited)
These improvements are just awesome!
I also like the overall "story-style" of this post