# A Hacky Guide to Hive (part 2.2.1: blocks)

by @felixxx

## Context

In the [previous post](/@felixxx/a-hacky-guide-to-hive-part-21-yo-broadcast), I made a [special transaction](https://hivehub.dev/tx/eb025cf797ee5bc81d7399282268079cc29cc66d).
I broadcast a ``custom_json`` transaction with the ``id`` ``YO``.
This information will forever be stored in block **89040473** of Hive's blockchain.
To **get to** this information again, I could query a Hive node's:

- [block_api.get_block](https://developers.hive.io/apidefinitions/#block_api.get_block), by block number
- [transaction_status_api.find_transaction](https://developers.hive.io/apidefinitions/#transaction_status_api.find_transaction), by transaction ID

If I don't know those two parameters but want to find **my** move, I could use:

- [account_history_api.get_account_history](https://developers.hive.io/apidefinitions/#account_history_api.get_account_history), by account name...

...you can access blockchain data in many different ways; use the above endpoints with Beem or [lighthive](https://github.com/emre/lighthive)...
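Or you can skip the libraries and hit a node directly. As a rough sketch, a plain ``requests`` call to ``block_api.get_block`` for the block mentioned above could look like this (the node URL is just one of the public endpoints):

```
import requests

url = 'https://api.hive.blog'

# raw JSON-RPC request for block 89040473
payload = '{"jsonrpc":"2.0", "method":"block_api.get_block", "params":{"block_num":89040473}, "id":1}'

response = requests.post(url=url, data=payload)
print(response.json()['result']['block'])
```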

I demonstrated how anyone can YO; now I want to show a method to get to all YOs.
It could be any custom_json, or a different event. It's just an example. It could be a _move_ in a blockchain game, or you could go as far as trying to build your own little hive engine.
You might want to observe votes or comments as they come in and store some, so you don't have to look them up again later, maybe for a notification system...

___
![FELIXBOXVHSTAPE400YO.png](https://files.peakd.com/file/peakd-hive/felixxx/23u5SmjF6Ya2X8TzxdWqXjc4Q93v812HtAGqFF29S5VR3EFBFPvRV1apReyuY5XPXx8Vd.png)
___
## A Better Stream

In [another post](/@felixxx/a-hacky-guide-to-hive-part-15) I explained how the Hive blockchain is really just a very long list.

### ``block_api``

The block_api gives you access to all blocks.
You can access the block_api on all public nodes.
If you want to run [your own node](@gtg/witness-update-release-candidate-for-eclipse-is-out), serving only the block_api should be one of the cheapest options.

### ``stream()``

Basically, you could build most things around _just looking at all blocks as they are written_.
That will not include every piece of information (virtual values and such), but a lot.
This might not be the best approach for everything, but once you've got a stable block stream going, you can build good stuff around it...

### Beem

[Beem](https://github.com/holgern/beem/tree/master)'s stream() method still works and you could use it as is.

The main logic behind Beem's stream is hidden in the [blocks() method](https://github.com/holgern/beem/blob/master/beem/blockchain.py#L394). That part alone is 278 lines long and does a lot of things.
In the background, Beem can handle:

- node switching
- threading
- syncing
- private keys

... and more.
I could not build it better. I don't have to.

### Procedure

The main procedure to get to a block is still just a query.
The speed and reliability of that query depend mostly on the source (the node), not on the Python code.

[Python isn't particularly fast](https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/python3-gpp.html) to begin with.
But all we need it to do during this procedure is:
- Query the _next_ block
- Filter the block for YO
- Store YO

That's a job done.

At the moment, querying the latest block from api.hive.blog takes about 1 second.
Maximum block size is a [witness parameter](https://github.com/openhive-network/hive/blob/master/doc/witness_parameters.md#maximum_block_size):
> The value must not be more than 2MB (2097152).

...so there are 2 seconds left to handle at most 2MB (current max: 65536 bytes).
Just filtering and storing a block takes only milliseconds, even in Python...
Which means this thing can idle for almost 2 seconds and then repeat the procedure.

Beem actually [does that too](https://github.com/holgern/beem/blob/master/beem/blockchain.py#L572) 😅:
```
# Sleep for one block
time.sleep(self.block_interval)
```
### Storage

It doesn't really matter how I build the stream; without storage, I'll lose all progress when the stream ends or crashes.

I'll use SQL. I could use Redis, or Mongo...

There are many different storage solutions and I could never build anything better.
This stuff handles sessions and serialization. It comes with built-in backup solutions.
It's fast. It's scalable: I'll use SQLite, but you could plug in a giant cluster of whatever.
I am trying to move the responsibility of storage handling to where it belongs: the database level.
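Just as a teaser for the next post, here's a minimal sketch of what that could look like with Python's built-in ``sqlite3`` module; the table name and columns are placeholders, not the final schema:

```
import sqlite3

# placeholder schema: one row per YO, plus where it was found
con = sqlite3.connect('yo.db')
con.execute('CREATE TABLE IF NOT EXISTS yos (block_num INTEGER, block_id TEXT, json TEXT)')
con.commit()
con.close()
```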

### Threading and Node Switching

Beem can switch through nodes from a list and even manage worker threads.
But why manage that inside Python in the first place?

I will just build one single procedure and run it as a background service.
If I need another thread, I can just run another instance of the same procedure.
I could run one thread for every node, or even use separate machines.
Anyhow, the procedure does not need to know which thread it's in.
As long as I funnel the data to the same database in the end, all synchronization and serialization and whatnot is taken care of automatically.

I am trying to move the responsibility of concurrency to where it belongs: the operating system and database layers.
___
## Live Stream
### ``block_api.get_block_range``
```
import requests

def get_block_range(start, count, url):
    # raw JSON-RPC request to a Hive node's block_api.get_block_range
    data = '{"jsonrpc":"2.0", "method":"block_api.get_block_range", "params":{"starting_block_num":' + str(start) + ', "count":' + str(count) + '}, "id":1}'
    response = requests.post(url=url, data=data)
    return response.json()['result']['blocks']
```
The only function you really need.
I am not even joking.
- Usage:

```
url = 'https://api.hive.blog'

for block in get_block_range(89040473, 1, url):
	print(block)
```
### Loop

For a stream, you only need to loop this: pick a _start_ block and then increment.
Repeat every 3 seconds and it's basically Beem's stream(), without all the fluff.
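A minimal sketch of that loop, reusing the ``get_block_range`` function from above; the start block and node URL are just example values:

```
import time

url = 'https://api.hive.blog'
block_num = 89040473  # example start block

while True:
    blocks = get_block_range(block_num, 1, url)
    for block in blocks:
        print(block['block_id'])  # filtering / storing would go here
    block_num += len(blocks)      # only advance past blocks actually received
    time.sleep(3)                 # roughly one block interval
```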

But that's an infinite loop.
For the final service, that's what I'd want; for a code snippet, I feel like avoiding it.

In the early days, nodes accepted websockets. I don't know why that got turned off. Maybe it was too expensive. Maybe you can still do something like that on your own node.
Anyway, if you test this on the public nodes, you are stuck with this 3-second-query loop. It seems crude, but that seems to be how it's done.

[The documentation recommends Beem's stream](https://developers.hive.io/tutorials-python/stream_blockchain_transactions.html).

@jesta's chainsync [does it](https://github.com/aaroncox/chainsync/blob/master/chainsync/chainsync.py#L190): 
```
time.sleep(self.get_approx_sleep_until_block(throttle, config, status['time']))
```
So yeah... I also wait 3 seconds.

### Interrupt

Best case would be: I start the loop once and it runs infinitely (fire & forget).
In reality, I have to prepare for what happens should it stop.
Maybe I need to resync the whole service...

The above is all it takes to rebuild Beem's stream or any other.
Wrap some try/excepts around it and it can't really break down.

But for something useful, storage is necessary.
So that I at least know where the last stream stopped. And where to begin...
For YO, I could ignore all 89040473 blocks before the first YO.
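For illustration, here's the loop from above wrapped in a try/except; the error handling is deliberately crude (log and retry), and the persistent "where did I stop" part is what the storage post will add:

```
import time

def run_stream(start, url):
    block_num = start
    while True:
        try:
            blocks = get_block_range(block_num, 1, url)
            # filtering and storing each block would go here
            block_num += len(blocks)
        except Exception as error:
            # node down, timeout, malformed response... just log and retry
            print('stream interrupted:', error)
        time.sleep(3)
```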

### Traffic

That 3-second-query thing may seem like a lot of traffic.
But if it's planned well, and stored well, it only has to be done once for any block.
From that point on, it can feed a whole network of other things, which don't have to make any queries outside of my own database.

Again: for things like posts and author balance, the standard APIs can be enough.
Also: posts, votes, and account balances can **change**; blocks can't.

Sending one request every 3 seconds, receiving 60KB of data at most...
I don't know how annoying this is for node providers.
I guess it's ok...

Syncing might be different. In the docs, there's a get_block_range example with count=1000.
The response could be 60MB, but that could also sync 50 minutes' worth of blocks in a single call...
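A quick sketch of what such a catch-up sync could look like, reusing ``get_block_range`` with count=1000 as in the docs example; the start and stop values are placeholders:

```
url = 'https://api.hive.blog'
start = 89040473   # example: first block I care about
stop = 89060473    # example: placeholder upper bound

block_num = start
while block_num < stop:
    batch = get_block_range(block_num, 1000, url)
    if not batch:
        break  # reached the head of the chain (or the node returned nothing)
    for block in batch:
        pass  # filter and store each block here
    block_num += len(batch)
```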

### Filter YO

```
def get_yos(block):
    yos = []
    for transaction in block['transactions']:
        for operation in transaction['operations']:            
            if operation['type'] == 'custom_json_operation':
                if operation['value']['id'] == 'YO':
                    yos.append(operation)
    return yos
```
Returns all YOs in a block, but loses the information about which transaction and which block each YO was in.

I'll try to avoid data manipulation in this part of the service.
This part is the stream and shouldn't be involved in anything else.
However, I do want to store the _block num_, which already got lost along the way.
I also want _block id_ and _previous_. This just demonstrates how to filter data.
It's best to start by building the tables first, though.
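Still, for illustration, here's a variant of the filter that keeps that block context; since the filter only sees the block, the block number gets passed in as a parameter, and the returned shape (a list of dicts) is just one possible choice:

```
def get_yos_with_context(block, block_num):
    yos = []
    for transaction in block['transactions']:
        for operation in transaction['operations']:
            if operation['type'] == 'custom_json_operation':
                if operation['value']['id'] == 'YO':
                    yos.append({
                        'block_num': block_num,
                        'block_id': block['block_id'],
                        'previous': block['previous'],
                        'operation': operation,
                    })
    return yos
```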

## Conclusion

It might not look like much, but the part that **needs** to connect to a Hive node is done.
This is the absolute minimum necessary and can only fail at very few points so far.
Most possible problems can be caught outside of this core logic.
All that's missing is persistent storage, which I will cover in the next post.

Anyway, threading, concurrency, data manipulation, whatever... everything else can and should happen later, upstream.
What I keep trying to point out: all extra logic should be avoided.
I am looking at a Hive query as a single step - a procedure. It should be a single function.
Next post, storage will be wrapped in as few procedures as possible, and that will result in a YO crawler/watcher that feeds a db that you could plug **anything** into. It will probably be short and include only minimal logic. That's a feature.

### Naming

I think the hardest question in programming is naming.
'YO crawler' isn't good. I should give this thing a name before it's finished.
custom_jacksn, or custom_YOson maybe? Or YOmind...
___

@arc7icwolf:
>custom_YOson

I vote for this one 🤣

Btw, these posts are very interesting - *also because they help me realize how my scripts are even worse than what I thought* !LOL - and, at the same time, quite hard to understand for me.

There's plenty of information, and I have to find some free time to start doing some tests and exercises, as I've found that this is what helps me the most in understanding the most difficult stuff.
@felixxx:
>hard to understand

I think it may be a bit hard to understand because it doesn't do anything yet.
I need 1 more post to finish custom_YOson. 
Then one more post to show what it can be used for.

I hope it all makes more sense then.
It's very few lines of code...
@arc7icwolf:
There are also a lot of concepts that I'm not familiar with, but by reading about them at least I'm starting to get a wider idea of how they are all interconnected.
@sorin.cristescu:
Is @holger80 still around? I think he left, right? Is anyone still maintaining the Beem library?
@felixxx:
afaik @holger80 is still doing stuff, but not around Hive.
I finished the [next part](/@felixxx/a-hacky-guide-to-hive-part-222-customyoson) of this guide today.
What I am trying to demonstrate is that in many cases you don't need a library and are better off without one.
