# A Hacky Guide to Hive (part 2.2.1: blocks)

by @felixxx

## Context

In the [previous post](/@felixxx/a-hacky-guide-to-hive-part-21-yo-broadcast), I made a [special transaction](https://hivehub.dev/tx/eb025cf797ee5bc81d7399282268079cc29cc66d).
I broadcast a ``custom_json`` transaction with the ``id`` ``YO``.
This information will forever be stored in block **89040473** of Hive's blockchain.
To **get to** this information again, I could query a Hive node's:

- [block_api.get_block](https://developers.hive.io/apidefinitions/#block_api.get_block), by block number
- [transaction_status_api.find_transaction](https://developers.hive.io/apidefinitions/#transaction_status_api.find_transaction), by transaction ID

If I don't know those two parameters but want to find **my** move, I could use:

- [account_history_api.get_account_history](https://developers.hive.io/apidefinitions/#account_history_api.get_account_history), by account name...

...you can access blockchain data in many different ways; use the above endpoints with Beem or [lighthive](https://github.com/emre/lighthive)...
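Or you can skip the libraries and hit a node directly. As a rough sketch, a plain ``requests`` call to ``block_api.get_block`` for the block mentioned above could look like this (the node URL is just one of the public endpoints):

```
import requests

url = 'https://api.hive.blog'

# raw JSON-RPC request for block 89040473
payload = '{"jsonrpc":"2.0", "method":"block_api.get_block", "params":{"block_num":89040473}, "id":1}'

response = requests.post(url=url, data=payload)
print(response.json()['result']['block'])
```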

I demonstrated how anyone can YO; now I want to show a method to get to all YOs.
It could be any custom_json, or a different event. It's just an example. It could be a _move_ in a blockchain game, or you could go as far as trying to build your own little hive engine.
You might want to observe votes or comments as they come in and store some, so you don't have to look them up again later, maybe for a notification system...

___
![FELIXBOXVHSTAPE400YO.png](https://files.peakd.com/file/peakd-hive/felixxx/23u5SmjF6Ya2X8TzxdWqXjc4Q93v812HtAGqFF29S5VR3EFBFPvRV1apReyuY5XPXx8Vd.png)
___
## A Better Stream

In [another post](/@felixxx/a-hacky-guide-to-hive-part-15) I explained how the Hive blockchain is really just a very long list.

### ``block_api``

The block_api gives you access to all blocks.
You can access the block_api on all public nodes.
If you want to run [your own node](@gtg/witness-update-release-candidate-for-eclipse-is-out), serving only the block_api should be one of the cheapest options.

### ``stream()``

Basically, you could build most things around _just looking at all blocks as they are written_.
That will not include every piece of information (virtual values and such), but a lot.
This might not be the best approach for everything, but once you've got a stable block stream going, you can build good stuff around it...

### Beem

[Beem](https://github.com/holgern/beem/tree/master)'s stream() method still works and you could use it as is.

The main logic behind Beem's stream is hidden in the [blocks() method](https://github.com/holgern/beem/blob/master/beem/blockchain.py#L394). That part alone is 278 lines long and does a lot of things.
In the background, Beem can handle:

- node switching
- threading
- syncing
- private keys

... and more.
I could not build it better. I don't have to.

### Procedure

The main procedure to get to a block is still just a query.
The speed and reliability of that query depend mostly on the source (the node), not on the Python code.

[Python isn't particularly fast](https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/python3-gpp.html) to begin with.
But all we need it to do during this procedure is:
- Query the _next_ block
- Filter the block for YO
- Store YO

That's a job done.

At the moment, querying the latest block from api.hive.blog takes about 1 second.
Maximum block size is a [witness parameter](https://github.com/openhive-network/hive/blob/master/doc/witness_parameters.md#maximum_block_size):
> The value must not be more than 2MB (2097152).

...so there are 2 seconds left to handle at most 2MB (current max: 65536 bytes).
Just filtering and storing a block takes only milliseconds, even in Python...
Which means this thing can idle for almost 2 seconds and then repeat the procedure.

Beem actually [does that too](https://github.com/holgern/beem/blob/master/beem/blockchain.py#L572) 😅:
```
# Sleep for one block
time.sleep(self.block_interval)
```
### Storage

It doesn't really matter how I build the stream; without storage, I'll lose all progress when the stream ends or crashes.

I'll use SQL. I could use Redis, or Mongo...

There are many different storage solutions and I could never build anything better.
This stuff handles sessions and serialization. It comes with built-in backup solutions.
It's fast. It's scalable: I'll use SQLite, but you could plug in a giant cluster of whatever.
I am trying to move the responsibility of storage handling to where it belongs: the database level.
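Just as a teaser for the next post, here's a minimal sketch of what that could look like with Python's built-in ``sqlite3`` module; the table name and columns are placeholders, not the final schema:

```
import sqlite3

# placeholder schema: one row per YO, plus where it was found
con = sqlite3.connect('yo.db')
con.execute('CREATE TABLE IF NOT EXISTS yos (block_num INTEGER, block_id TEXT, json TEXT)')
con.commit()
con.close()
```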

### Threading and Node Switching

Beem can switch through nodes from a list and even manage worker threads.
But why manage that inside Python in the first place?

I will just build one single procedure and run it as a background service.
If I need another thread, I can just run another instance of the same procedure.
I could run one thread for every node, or even use separate machines.
Anyhow, the procedure does not need to know which thread it's in.
As long as I funnel the data to the same database in the end, all synchronization and serialization and whatnot is taken care of automatically.

I am trying to move the responsibility of concurrency to where it belongs: the operating system and database layers.
___
## Live Stream
### ``block_api.get_block_range``
```
import requests

def get_block_range(start, count, url):
    # raw JSON-RPC request to a Hive node's block_api.get_block_range
    data = '{"jsonrpc":"2.0", "method":"block_api.get_block_range", "params":{"starting_block_num":' + str(start) + ', "count":' + str(count) + '}, "id":1}'
    response = requests.post(url=url, data=data)
    return response.json()['result']['blocks']
```
The only function you really need.
I am not even joking.
- Usage:

```
url = 'https://api.hive.blog'

for block in get_block_range(89040473, 1, url):
	print(block)
```
### Loop

For a stream, you only need to loop this: pick a _start_ block and then increment.
Repeat every 3 seconds and it's basically Beem's stream(), without all the fluff.
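A minimal sketch of that loop, reusing the ``get_block_range`` function from above; the start block and node URL are just example values:

```
import time

url = 'https://api.hive.blog'
block_num = 89040473  # example start block

while True:
    blocks = get_block_range(block_num, 1, url)
    for block in blocks:
        print(block['block_id'])  # filtering / storing would go here
    block_num += len(blocks)      # only advance past blocks actually received
    time.sleep(3)                 # roughly one block interval
```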

But that's an infinite loop.
For the final service, that's what I'd want; for a code snippet, I feel like avoiding it.

In the early days, nodes accepted websockets. I don't know why that got turned off. Maybe it was too expensive. Maybe you can still do something like that on your own node.
Anyway, if you test this on the public nodes, you are stuck with this 3-second-query loop. It seems crude, but that seems to be how it's done.

[The documentation recommends Beem's stream](https://developers.hive.io/tutorials-python/stream_blockchain_transactions.html).

@jesta's chainsync [does it](https://github.com/aaroncox/chainsync/blob/master/chainsync/chainsync.py#L190): 
```
time.sleep(self.get_approx_sleep_until_block(throttle, config, status['time']))
```
So yeah... I also wait 3 seconds.

### Interrupt

Best case would be: I start the loop once and it runs infinitely (fire & forget).
In reality, I have to prepare for what happens should it stop.
Maybe I need to resync the whole service...

The above is all it takes to rebuild Beem's stream or any other.
Wrap some try/excepts around it and it can't really break down.

But for something useful, storage is necessary.
So that I at least know where the last stream stopped. And where to begin...
For YO, I could ignore all 89040473 blocks before the first YO.
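For illustration, here's the loop from above wrapped in a try/except; the error handling is deliberately crude (log and retry), and the persistent "where did I stop" part is what the storage post will add:

```
import time

def run_stream(start, url):
    block_num = start
    while True:
        try:
            blocks = get_block_range(block_num, 1, url)
            # filtering and storing each block would go here
            block_num += len(blocks)
        except Exception as error:
            # node down, timeout, malformed response... just log and retry
            print('stream interrupted:', error)
        time.sleep(3)
```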

### Traffic

That 3-second-query thing may seem like a lot of traffic.
But if it's planned well, and stored well, it only has to be done once for any block.
From that point on, it can feed a whole network of other things, which don't have to make any queries outside of my own database.

Again: for things like posts and author balance, the standard APIs can be enough.
Also: posts, votes, and account balances can **change**; blocks can't.

Sending one request every 3 seconds, receiving 60KB of data at most...
I don't know how annoying this is for node providers.
I guess it's ok...

Syncing might be different. In the docs, there's a get_block_range example with count=1000.
The response could be 60MB, but that could also sync 50 minutes' worth of blocks in a single call...
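A quick sketch of what such a catch-up sync could look like, reusing ``get_block_range`` with count=1000 as in the docs example; the start and stop values are placeholders:

```
url = 'https://api.hive.blog'
start = 89040473   # example: first block I care about
stop = 89060473    # example: placeholder upper bound

block_num = start
while block_num < stop:
    batch = get_block_range(block_num, 1000, url)
    if not batch:
        break  # reached the head of the chain (or the node returned nothing)
    for block in batch:
        pass  # filter and store each block here
    block_num += len(batch)
```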

### Filter YO

```
def get_yos(block):
    yos = []
    for transaction in block['transactions']:
        for operation in transaction['operations']:            
            if operation['type'] == 'custom_json_operation':
                if operation['value']['id'] == 'YO':
                    yos.append(operation)
    return yos
```
Returns all YOs in a block, but loses the information about which transaction and which block each YO was in.

I'll try to avoid data manipulation in this part of the service.
This part is the stream and shouldn't be involved in anything else.
However, I do want to store the _block num_, which already got lost along the way.
I also want _block id_ and _previous_. This just demonstrates how to filter data.
It's best to start by building the tables first, though.
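Still, for illustration, here's a variant of the filter that keeps that block context; since the filter only sees the block, the block number gets passed in as a parameter, and the returned shape (a list of dicts) is just one possible choice:

```
def get_yos_with_context(block, block_num):
    yos = []
    for transaction in block['transactions']:
        for operation in transaction['operations']:
            if operation['type'] == 'custom_json_operation':
                if operation['value']['id'] == 'YO':
                    yos.append({
                        'block_num': block_num,
                        'block_id': block['block_id'],
                        'previous': block['previous'],
                        'operation': operation,
                    })
    return yos
```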

## Conclusion

It might not look like much, but the part that **needs** to connect to a Hive node is done.
This is the absolute minimum necessary and can only fail at very few points so far.
Most possible problems can be caught outside of this core logic.
All that's missing is persistent storage, which I will cover in the next post.

Anyway, threading, concurrency, data manipulation, whatever... everything else can and should happen later, upstream.
What I keep trying to point out: all extra logic should be avoided.
I am looking at a Hive query as a single step - a procedure. It should be a single function.
Next post, storage will be wrapped in as few procedures as possible, and that will result in a YO crawler/watcher that feeds a db that you could plug **anything** into. It will probably be short and include only minimal logic. That's a feature.

### Naming

I think the hardest question in programming is naming.
'YO crawler' isn't good. I should give this thing a name before it's finished.
custom_jacksn, or custom_YOson maybe? Or YOmind...
___

@arc7icwolf:
>custom_YOson

I vote for this one 🤣

Btw, these posts are very interesting - *also because they help me realize how my scripts are even worse than what I thought* !LOL - and, at the same time, quite hard to understand for me.

There's plenty of information, and I have to find some free time to start doing some tests and exercises, as I've found that this is what helps me the most in understanding the most difficult stuff.
@felixxx:
>hard to understand

I think it may be a bit hard to understand because it doesn't do anything yet.
I need 1 more post to finish custom_YOson. 
Then one more post to show what it can be used for.

I hope it all makes more sense then.
It's very few lines of code...
@arc7icwolf:
There are also a lot of concepts that I'm not familiar with, but by reading about them at least I'm starting to get a wider idea of how they are all interconnected.
@sorin.cristescu:
Is @holger80 still around? I think he left, right? Is anyone still maintaining the Beem library?
@felixxx:
afaik @holger80 is still doing stuff, but not around Hive.
I finished the [next part](/@felixxx/a-hacky-guide-to-hive-part-222-customyoson) of this guide today.
What I am trying to demonstrate is that in many cases you don't need a library and are better off without one.
