Part 21: Use Multi Threading To Analyse The Steem Blockchain In Parallel by steempytutorials

<center>![steem-python.png](https://res.cloudinary.com/hpiynhbhq/image/upload/v1515886103/kmzfcpvtzuwhvqhgpyjp.png)</center>

This tutorial is part of a series where different aspects of programming with `steem-python` are explained. Links to the other tutorials can be found in the curriculum section below. This part is a direct continuation of [Part 19: Analysing The Steem Blockchain From A Custom Block Number For A Custom Block Count](https://steemit.com/utopian-io/@steempytutorials/part-19-analysing-the-steem-blockchain-for-a-custom-block-number-for-a-custom-block-count). Where the previous part focussed on how to access blocks on the `Steem Blockchain` one by one, this part looks at how this process can be parallelised.

---

#### What will I learn

- Which data is suitable for parallelisation
- Divide work between threads
- How does a thread class work
- Create threads
- Prevent data corruption with a queue and lock
- Merge data received from threads


#### Requirements

- Python3.6
- `steem-python`

#### Difficulty

- Intermediate

---

### Tutorial

#### Setup
Download the file from [GitHub](https://github.com/amosbastian/steempy-tutorials/tree/master/part_21). There is one file, `multi_threaded.py`, which contains the code. The script takes two command-line arguments: the number of `blocks` to analyse and how many `threads` to use.

Run the script as follows:
`> python multi_threaded.py 1000 8`

#### Which data is suitable for parallelisation?
`Parallelisation` is a great way to improve efficiency and use all the cores in current CPUs. However, not all data is equally suitable. The best data for `parallelisation` consists of items that are independent of each other and can be processed and combined in any order without altering the outcome.

In this case we will be analysing blocks from the `Steem Blockchain` and counting how many times each `operation` is used. For this example it does not matter at which `block` the counting is started, or in which sequence the blocks are counted. As long as all `blocks` are counted the end result will be the same. Therefore, this is perfect for `parallelisation`.
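
As a minimal sketch of why the order does not matter, the snippet below counts the same blocks in two different orders and compares the totals. The sample `blocks` list and the `count_operations` helper are made up for illustration and are not part of `multi_threaded.py`.

```python
from collections import Counter
import random

# Hypothetical sample data: each "block" is just a list of operation names.
blocks = [
    ["vote", "comment", "vote"],
    ["transfer", "vote"],
    ["comment", "custom_json"],
    ["vote"],
]

def count_operations(block_list):
    """Count how often each operation type occurs in the given blocks."""
    totals = Counter()
    for block in block_list:
        totals.update(block)
    return totals

in_order = count_operations(blocks)

# Count the same blocks again, but in a random order.
shuffled = blocks[:]
random.shuffle(shuffled)
out_of_order = count_operations(shuffled)

print(in_order == out_of_order)  # True
```

Because the totals are identical regardless of the order, each thread can count its own share of the blocks independently.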

#### Divide work between threads
For optimal performance the work has to be divided equally. The amount of work per thread, `n`, is calculated by dividing the total `block_count` by the `amount_of_threads`. For this to work `n` has to be a whole number, so choose the `block_count` and `amount_of_threads` accordingly.

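```python
import sys

block_count = int(sys.argv[1])        # total number of blocks to analyse
amount_of_threads = int(sys.argv[2])  # number of threads to divide them over

# Blocks per thread; block_count should be divisible by amount_of_threads
n = block_count // amount_of_threads
```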

Each thread has its own `start_block` and `end_block`. Since the first block of each range is also counted, 1 has to be subtracted from `n` when calculating the `end_block` to prevent overlap.

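```python
start = initial_value  # first block to analyse

for thread_id in range(amount_of_threads):
	start_block = start
	end_block = start + n - 1  # inclusive, hence the -1
	start = start + n          # the next thread starts right after end_block
```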
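
The division above can be sketched as a small standalone function. `split_range` is a hypothetical helper, not something from `multi_threaded.py`, and unlike the tutorial code it also spreads a remainder over the first threads, which is an assumption about how one might want to handle a `block_count` that is not evenly divisible.

```python
def split_range(first_block, block_count, amount_of_threads):
    """Return a list of inclusive (start_block, end_block) pairs."""
    n, remainder = divmod(block_count, amount_of_threads)
    ranges = []
    start = first_block
    for thread_id in range(amount_of_threads):
        # The first `remainder` threads take one extra block each.
        size = n + (1 if thread_id < remainder else 0)
        if size == 0:
            break  # more threads than blocks, nothing left to hand out
        ranges.append((start, start + size - 1))
        start += size
    return ranges

print(split_range(19512130, 1000, 8)[:2])
# [(19512130, 19512254), (19512255, 19512379)]
```

With 1000 blocks and 8 threads every range holds 125 blocks, and no block appears in two ranges.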
<br>

#### How does a thread class work
To create threads, a subclass of `threading.Thread` has to be defined, consisting of an `__init__` method and a `run()` method. The `__init__` method stores all unique and shared variables that the thread requires.

```python
class myThread (threading.Thread):
	def __init__(self, thread_id, start_block, end_block, n, blockchain, workQueue, queueLock):
		threading.Thread.__init__(self)
		self.thread_id 		= thread_id
		self.start_block 	= start_block
		self.end_block 		= end_block
		self.n 				= n
		self.blockchain 	= blockchain
		self.stream			= self.blockchain.stream_from(start_block=start_block, end_block=end_block)
		self.current_block	= self.start_block
		self.workQueue	= workQueue
		self.queueLock		= queueLock

		print (self.thread_id, self.start_block, self.end_block, '\n')
```

The `run()` method contains the work the thread does. It is called automatically when the thread is started with `start()`.

```python
# Method of myThread
def run(self):
	data = {}
	for post in self.stream:
		if post['block'] != self.current_block:
			# Do stuff for each new block
			pass
```
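
To see the `__init__` / `run()` pattern in isolation, here is a minimal sketch that runs without `steem-python`. The `CounterThread` class, the `fake_stream` list and its `"type"` key are assumptions made for this example and only stand in for the real blockchain stream.

```python
import threading
from collections import Counter

class CounterThread(threading.Thread):
    """Counts operation types in its own list of (fake) operations."""

    def __init__(self, thread_id, operations):
        super().__init__()
        self.thread_id = thread_id
        self.operations = operations  # stand-in for the blockchain stream
        self.data = Counter()

    def run(self):
        # Executed in the new thread once start() has been called.
        for op in self.operations:
            self.data[op["type"]] += 1

# Hypothetical operations used instead of a real stream.
fake_stream = [{"type": "vote"}, {"type": "comment"}, {"type": "vote"}]

thread = CounterThread(0, fake_stream)
thread.start()  # runs run() in a separate thread
thread.join()   # wait until run() has finished
print(dict(thread.data))  # {'vote': 2, 'comment': 1}
```

Note that `run()` is never called directly; calling `start()` is what makes it execute in its own thread.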

#### Create threads
Create a list for the `threads`. Create each `thread` with its unique and shared variables, start it, and append it to the list.

```python
threads = []

for x in range(0, amount_of_threads):
	thread = myThread(x, start, start + n - 1, n, blockchain, workQueue, queueLock)
	thread.start()
	threads.append(thread)
	start = start + n
```

#### Prevent data corruption with a queue and lock
The code is set up in such a way that each thread does all its own computations. When it is done, it adds its data to a `queue` from which the main thread retrieves it. Since multiple threads may finish at the same time, a locking mechanism is used to prevent data corruption. Note that Python's `queue.Queue` already does its own internal locking, so the explicit lock is an extra safeguard rather than a strict requirement.

```python
# variables
queueLock = threading.Lock()
workQueue = queue.Queue(amount_of_threads)

# locking/unlocking sequence
self.queueLock.acquire()
self.workQueue.put(data)
self.queueLock.release()
```
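
The same hand-over can be sketched as a runnable example. The `worker` function and the dictionary it produces are placeholders invented for this sketch, and the `with` statement is used here as an alternative to calling `acquire()` and `release()` by hand.

```python
import queue
import threading

amount_of_threads = 4
work_queue = queue.Queue(amount_of_threads)
queue_lock = threading.Lock()

def worker(thread_id):
    # Each worker first builds its result privately...
    data = {"thread_id": thread_id, "blocks_done": 10}
    # ...and only then hands it over. The `with` block acquires the lock
    # and releases it again, even if put() raises an exception.
    with queue_lock:
        work_queue.put(data)

threads = [threading.Thread(target=worker, args=(i,))
           for i in range(amount_of_threads)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(work_queue.qsize())  # 4
```

The queue is created with room for one result per thread, so no `put()` has to wait for space while holding the lock.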
#### Merge data received from threads
The main thread waits for all the threads to finish and return their work.

```python
# wait for threads
for t in threads:
	t.join()
```

Now it can retrieve all the finished work from the `queue` and merge it together.

```python
merged_data = {}

while not workQueue.empty():
	data = workQueue.get()
	for key in data:
		if key not in merged_data:
			merged_data[key] = data[key]
		else:
			merged_data[key] += data[key]
``` 
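
As a possible shortcut, the same merge can be written with `collections.Counter`, which adds up values per key. This is a sketch with two made-up results in the queue, not the code used in `multi_threaded.py`.

```python
import queue
from collections import Counter

# Two hypothetical per-thread results waiting in the queue.
work_queue = queue.Queue()
work_queue.put({"vote": 3, "comment": 1})
work_queue.put({"vote": 2, "transfer": 4})

merged_data = Counter()
while not work_queue.empty():
    # update() adds the counts to any existing totals for the same key.
    merged_data.update(work_queue.get())

print(dict(merged_data))  # {'vote': 5, 'comment': 1, 'transfer': 4}
```

Checking `empty()` is only reliable here because all worker threads have already been joined, so nothing else is writing to the queue.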

#### Running the script
Running the script will analyse the set number of `blocks` back in time from the current `head block`. It will divide the `blocks` over the set `amount_of_threads` and print each `thread` and the work this `thread` has to do. During the process each thread reports its current progress. At the end the merged data is printed.

Test for yourself with different `block_counts` and `amount_of_threads` how much of a difference multi threading makes for this type of work.

```
python multi_threaded.py 1000 8

0 19512130 19512254
1 19512255 19512379
2 19512380 19512504
3 19512505 19512629
4 19512630 19512754
5 19512755 19512879
6 19512880 19513004
7 19513005 19513129
...
Thread 2 is at block 19512445/19512504 51.20%
Thread 0 is at block 19512199/19512254 54.40%
Thread 3 is at block 19512573/19512629 53.60%
Thread 4 is at block 19512694/19512754 50.40%
Thread 7 is at block 19513071/19513129 52.00%
Thread 6 is at block 19512936/19513004 44.00%
Thread 5 is at block 19512822/19512879 52.80%
Thread 1 is at block 19512321/19512379 52.00%
...
{'custom_json': 17688, 'claim_reward_balance': 1569, 'vote': 23629, 'comment': 9053, 'transfer_to_vesting': 82, 'comment_options': 1213, 'limit_order_create': 126, 'fill_order': 74, 'return_vesting_delegation': 675, 'producer_reward': 1000, 'curation_reward': 4428, 'author_reward': 1687, 'transfer': 1615, 'comment_benefactor_reward': 335, 'fill_vesting_withdraw': 82, 'account_update': 327, 'account_create_with_delegation': 43, 'delete_comment': 66, 'fill_transfer_from_savings': 6, 'feed_publish': 98, 'account_witness_vote': 38, 'account_witness_proxy': 5, 'transfer_to_savings': 6, 'account_create': 4, 'limit_order_cancel': 29, 'delegate_vesting_shares': 10, 'withdraw_vesting': 17, 'transfer_from_savings': 3, 'cancel_transfer_from_savings': 2, 'witness_update': 1}
```
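
One way to compare thread counts without touching the blockchain is to time a simulated workload. In the sketch below `fake_fetch` and `timed_run` are hypothetical helpers, and `time.sleep` merely imitates waiting on the network, so the numbers say nothing about the real script.

```python
import threading
import time

def fake_fetch(block_count, delay=0.01):
    """Stand-in for fetching blocks: sleeps to imitate network latency."""
    for _ in range(block_count):
        time.sleep(delay)

def timed_run(block_count, amount_of_threads):
    """Return the seconds needed to 'fetch' block_count blocks."""
    n = block_count // amount_of_threads
    threads = [threading.Thread(target=fake_fetch, args=(n,))
               for _ in range(amount_of_threads)]
    started = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - started

for amount in (1, 2, 4):
    print(amount, "threads:", round(timed_run(40, amount), 2), "seconds")
```

The simulated run gets faster with more threads because the threads spend their time waiting. In CPython, work that is dominated by pure Python computation rather than waiting generally benefits much less from threads.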



#### Curriculum
##### Set up:
- [Part 0: How To Install Steem-python, The Official Steem Library For Python](https://utopian.io/utopian-io/@amosbastian/how-to-install-steem-python-the-official-steem-library-for-python)
- [Part 1: How To Configure The Steempy CLI Wallet And Upvote An Article With Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-1-how-to-configure-the-steempy-cli-wallet-and-upvote-an-article-with-steem-python)
##### Filtering
- [Part 2: How To Stream And Filter The Blockchain Using Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-2-how-to-stream-and-filter-the-blockchain-using-steem-python)
- [Part 6: How To Automatically Reply To Mentions Using Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-6-how-to-automatically-reply-to-mentions-using-steem-python)
##### Voting
- [Part 3: Creating A Dynamic Autovoter That Runs 24/7](https://utopian.io/utopian-io/@steempytutorials/part-3-creating-a-dynamic-upvote-bot-that-runs-24-7-first-weekly-challenge-3-steem-prize-pool)
- [Part 4: How To Follow A Voting Trail Using Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-4-how-to-follow-a-voting-trail-using-steem-python)
- [Part 8: How To Create Your Own Upvote Bot Using Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-8-how-to-create-your-own-upvote-bot-using-steem-python)
##### Posting
- [Part 5: Post An Article Directly To The Steem Blockchain And Automatically Buy Upvotes From Upvote Bots](https://utopian.io/utopian-io/@steempytutorials/part-5-post-an-article-directly-to-the-steem-blockchain-and-automatically-buy-upvotes-from-upvote-bots)
- [Part 7: How To Schedule Posts And Manually Upvote Posts For A Variable Voting Weight With Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-7-how-to-schedule-posts-and-manually-upvote-posts-for-a-variable-voting-weight-with-steem-python)
##### Constructing
- [Part 10: Use Urls To Retrieve Post Data And Construct A Dynamic Post With Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-10-use-urls-to-retrieve-post-data-and-construct-a-dynamic-post-with-steem-python)
##### Rewards
- [Part 9: How To Calculate A Post's Total Rewards Using Steem-Python](https://utopian.io/utopian-io/@steempytutorials/how-to-calculate-a-post-s-total-rewards-using-steem-python)
- [Part 12: How To Estimate Curation Rewards Using Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-12-how-to-estimate-curation-rewards)
- [Part 14: How To Estimate All Rewards In Last N Days Using Steem-Python](https://utopian.io/utopian-io/@steempytutorials/how-to-estimate-all-rewards-in-last-n-days-using-steem-python)
- [Part 20: Plotting Account's Total Generated Post Rewards Since Creation](https://steemit.com/utopian-io/@steempytutorials/part-20-plotting-account-s-total-generated-post-rewards-since-creation)
##### Transfers
- [Part 11: How To Build A List Of Transfers And Broadcast These In One Transaction With Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-11-how-to-build-a-list-of-transfers-and-broadcast-these-in-one-transaction-with-steem-python)
- [Part 13: Upvote Posts In Batches Based On Current Voting Power With Steem-Python](https://utopian.io/utopian-io/@steempytutorials/part-13-upvote-posts-in-batches-based-on-current-voting-power-with-steem-python)
##### Analysis
- [Part 15: How To Check If An Account Is Following Back And Retrieve Mutual Followers/Following Between Two Accounts](https://utopian.io/utopian-io/@steempytutorials/part-15-how-to-check-if-an-account-is-following-back-and-retrieve-mutual-followers-following-between-two-accounts)
- [Part 16: How To Analyse A User's Vote History In A Specific Time Period Using Steem-Python](https://steemit.com/utopian-io/@steempytutorials/part-16-how-to-analyse-a-user-s-vote-history-in-a-specific-time-period-using-steem-python)
- [Part 18: How To Analyse An Account's Resteemers Using Steem-Python](https://steemit.com/utopian-io/@steempytutorials/part-18-how-to-analyse-an-account-s-resteemers)
- [Part 19: Analysing The Steem Blockchain From A Custom Block Number For A Custom Block Count](http://utopian.io/utopian-io/@steempytutorials/part-19-analysing-the-steem-blockchain-for-a-custom-block-number-for-a-custom-block-count)
---
The code for this tutorial can be found on [GitHub](https://github.com/amosbastian/steempy-tutorials/tree/master/part_21)!

This tutorial was written by @juliank in conjunction with @amosbastian.


<br /><hr/><em>Posted on <a href="https://utopian.io/utopian-io/@steempytutorials/part-21-use-multi-threading-to-analyse-the-steem-blockchain-in-parallel">Utopian.io -  Rewarding Open Source Contributors</a></em><hr/>

Created: 2018-02-02 10:12:45
@r351574nc3 · (edited)
I think multithreading is being confused with parallelism and concurrency here. They're not the same. Concurrency guarantees two processes are happening simultaneously while multithreading does not. Multi-core processors offer the possibility that two threads **can** be processed independently, and although Intel's Hyper-Threading can make multithreading on the same core come really close to multiprocessing or concurrency, it is not the same. When two threads are processed by the same core, this is done serially and not in parallel.

### In a Nutshell

`parallelization > multithreading`
@roj ·
Thank you for the contribution. It has been approved.

You can contact us on [Discord](https://discord.gg/uTyJkNm).
**[[utopian-moderator]](https://utopian.io/moderators)**
@setianyareza ·
Good tutorial, I want to try this tutorial, thanks.
@usmantohasbi27 ·
I like u, but i think so your info after comflik
@utopian-io ·
### Hey @steempytutorials I am @utopian-io. I have just upvoted you!
#### Achievements
- Seems like you contribute quite often. AMAZING!
#### Suggestions
- Contribute more often to get higher and higher rewards. I wish to see you often!
- Work on your followers to increase the votes/rewards. I follow what humans do and my vote is mainly based on that. Good luck!
#### Get Noticed!
- Did you know project owners can manually vote with their own voting power or by voting power delegated to their projects? Ask the project owner to review your contributions!
#### Community-Driven Witness!
I am the first and only Steem Community-Driven Witness. <a href="https://discord.gg/zTrEMqB">Participate on Discord</a>. Lets GROW TOGETHER!
- <a href="https://v2.steemconnect.com/sign/account-witness-vote?witness=utopian-io&approve=1">Vote for my Witness With SteemConnect</a>
- <a href="https://v2.steemconnect.com/sign/account-witness-proxy?proxy=utopian-io&approve=1">Proxy vote to Utopian Witness with SteemConnect</a>
- Or vote/proxy on <a href="https://steemit.com/~witnesses">Steemit Witnesses</a>

[![mooncryption-utopian-witness-gif](https://steemitimages.com/DQmYPUuQRptAqNBCQRwQjKWAqWU3zJkL3RXVUtEKVury8up/mooncryption-s-utopian-io-witness-gif.gif)](https://steemit.com/~witnesses)

**Up-vote this comment to grow my power and help Open Source contributions like this one. Want to chat? Join me on Discord https://discord.gg/Pc8HG9x**