create account

How The Hive Fork Manager manage The Hive micro-forks by mickiewicz

View this thread on: hive.blogpeakd.comecency.com
· @mickiewicz ·
$58.70
How The Hive Fork Manager manage The Hive micro-forks
![hfm_rewind_title.png](https://images.hive.blog/DQmdCyzcVauDmaQVRDMYVuiYJZj7c4oXjNwqUG3w9CiM2mb/hfm_rewind_title.png)

There are situations when The Hive blockchain has to choose which of two valid but different blocks and their descendants should be included in the chain,  and which will be abandoned. The consensus algorithm solves the micro-fork problem, one queue of blocks will be abandoned and not visible in the blockchain, but what with the applications which already have built their state based on the abandoned blocks? The applications must solve this not trivial problem on their own, fortunately, The HAF introduced a common,  generic solution for the applications based on the Postgres database - automatically restore the SQL tables to state before the micro-fork occurrence.

>If You don't know what is The Hive Application Framework please read ["What is HAF"](https://hive.blog/hive-139531/@mickiewicz/what-is-haf)

##  How to rewind changes in SQL tables if they are made based on abandoned blocks?
The answer is simple: remember each change and revert them, if they are based on abandoned blocks. Let's move through an example:
We have a protocol ( described by JSONs  in CUSTOM JSON transactions ) which have three instructions:
1. **insert** a number with a given value to a table
2. **increment** numbers with a given value by one
3. **remove** numbers with a given value

Our application has a simple table NUMBERS with two columns: unique id and a number ( create the table with SQL: ```CREATE TABLE numbers(in SERIAL UNIQUE, number INTEGER``` )

Now we are getting blocks one by one with our protocol commands encoded in CUSTOM JSON transactions and content of the table after issuing the requested  commands:

| block number  |   command    | value        | 
|--------------------: | ----------------:   |  ----------:   | 
| 100                      | INSERT           |   (1, 33)     | 
| 101                      | INCREMENT |   (1, 34)     |
| 102                      | REMOVE        |   (1, 34)     | 

### How does Hive Fork Manger remember changes in tables?
Each table that is under the control of Hive Fork Manager has its shadow - a shadow table. A shadow table contains information about the changes for each row of its original version. We can distinguish only three operations that encompass any possible actions on any SQL tables content: we can ***create*** a new row, we can ***update*** already inserted row or we can ***delete*** the row. It means we don't have to bother with commands specified by the protocols (like in example insert, increment, remove), all cases can always be a presenter as the 3 basic edition operations. For the example we got the content of the shadow table:

| Change id | block number  |   command    | old value    | 
| -------------: |----------------------|:----------------:  |  -------:   |
|         1          | 100                      | CREATE         |   doesn't mater |
| 2                  | 101                      | UPDATE         |   (1, 33) |
| 3                  | 102                      | DELETE         |   (1,34)  |

Ok, but how to match saved changes with a row in the original table? In our example, it is quite easy, since each row has its unique ID, but this is not the case for every table. After a series of experiments, there was a decision to require any controlled table to inherit from some special table - a context base table,  which each application got during registration in The Hive Fork Manager. A context base table introduces a new column **hive_rowid** for each table inheriting for it, the value of this column is saved together with changes in a shadow table. Moreover, it is important to have a saved blocks number, for which the change occurred, any shadow table has a column with a block num. So for our example, the shadow table looks as below:

| Change id   | hive_rowid | block number  |   command    | old value           | 
| ---------------: | ---------------: |----------------------|:----------------:  |  -------:                 |
|  1                   |  1                    | 100                      | CREATE         |  doesn't mater |
|  2                   |   1                   | 101                      | UPDATE         |   (1, 33) |
|  3                   |   1                   | 102                      | DELETE         |    (1,34) |

### Ok, we have a shadow table, and now what?
If the Hive node will notify about abandoned blocks, then we can execute an opposite operation for each operation saved in a shadow table in reverse order of their occurrences.
For our example, let's imagine that the hive node informs the Hive Fork Manager about abandoning blocks 100, 101, and 102, in such case we execute operations in the order like below:
```
1. INSERT INTO numbers VALUES(1, 1, 34); -- oposite to last saved operation DELETE, insert row with id 1
2. UPDATE number SET id=1, number=33 WHERE hive_rowid=1; -- oposite to UPDATE, restore old values in a row 1
3. DELETE FROM numbers WHERE hive_rowid = 1; -- opposite to saved INSERT operation
```
**Tada!** All changes on table **numbers** made based on abandoned blocks 100, 101, and 102 are reverted! You can check the implementation [here](https://gitlab.syncad.com/hive/haf/-/blob/develop/src/hive_fork_manager/context_rewind/back_from_fork.sql), the function `hive.back_from_fork_one_table` is the entry point to the algorithm.

### Hmm, there is possible to do this much faster!
Instead of reverting each change in blocks one by one, it is better to back to the row state before the first abandoned block. In our example, it means simply removing the row with id 1 ( only 1 operation instead of 3 )! Indeed this algorithm is very fast, moreover, it was implemented in Hive Fork Manager, but because of some constraints in Postgres implementation it was replaced with a slower version with [commit](https://gitlab.syncad.com/hive/haf/-/commit/2e48fadd337c0667e3bce503c04e7a6a16fb33cb)

#### Why do we use a slower version of the rewind algorithm?
Because of constraints, SQL constraints can be applied on the SQL tables, for example: UNIQUE. If we made a small modification to the table definition in our example:
```CREATE TABLE numbers(in SERIAL UNIQUE, number INTEGER UNIQUE```
Now the numbers cannot repeat in the table. We can imagine situations when back to a state of rows may temporary violation the UNIQUE constraint during the rewind process, and the whole process will fail, even when at its end the constraint will be achieved:

| block number  |   command    | values        | 
|--------------------: | ----------------:   |  ----------:   | 
| 100                      | INSERT           |   (1, 33)     | 
| 101                      | INCREMENT |   (1, 34)     |
| 102                      | INSERT        |   (1,34),(2, 33)     | 
We got a shadow table:
| Change id   | hive_rowid | block number  |   command    | old value           | 
| ---------------: | ---------------: |----------------------|:----------------:  |  -------:                 |
|  1                   |  1                    | 100                      | INSERT         |  doesn't mater  but it is (1,33) |
|  2                   |   1                   | 101                      | UPDATE         |   (1, 33) |
|  3                   |   2                   | 102                      | INSERT         |    doesn't mater but it is (2,33) |
Let's imagine that blocks 101 and 102 become abandoned and we run the fastest rewind algorithm and the first row with hive rowid = 1 is reverted what means update its value back to  (1,33) -> It violates UNIQUE constraint for a value, now row 1 and 2 have value 33 and the SQL command will fail( there is a [test](https://gitlab.syncad.com/hive/haf/-/blob/develop/tests/integration/functional/hive_fork_manager/context_rewind/back_from_fork_constraint_unique_test.sql) to check if HFM will correctly work in such a situation). The fastest algorithm omitted constraints by using the Postgres feature ['SET CONSTRAINTS'](https://www.postgresql.org/docs/current/sql-set-constraints.html), which forces evaluate constraints during committing a transaction, not immediately during modifying a table. We have resigned from the fastest rewind algorithm because a constraint must be deferrable to be evaluated at the committing time what is inconvenient and disallow to use such a constraint to resolve 'ON CONFLICT' SQL construction what is described [here in the database documentation](https://www.postgresql.org/docs/12/sql-createtable.html):
>    Note that deferrable constraints cannot be used as conflict arbitrators in an INSERT statement that includes an ON CONFLICT DO UPDATE clause.

#### The fastest rewind algorithm was really fast
I have checked my notes to remind myself how much faster was the fastest algorithm against the current one, here are the results of measurements for 10k of rows:
| Test                        | THE FASTEST [ms]        |  CURRENT  [ms]          | CURRENT/THE FASTEST [-]|
| :------------------------   | :-----------------------     | :---------------------------    | :-------------------  |
| Back from insert 10k rows   | 23, 24, 23         **[23.3]**| 124, 124, 125        **[124.3]**|   5.33                |
| Back from delete 10k rows   | 36, 35, 36         **[35.6]**| 174, 169, 170          **[171]**|   4.80                |
| Back from update 10k rows   | 48, 48, 48           **[48]**| 238, 245, 239        **[240.7]**|   5.01                |
| Back from truncate 10k rows | 32, 31, 32         **[35.6]**| 166, 173, 166        **[168.3]**|   4.72                |

It means that the current algorithm is five times slower than the fastest! But we decided that the current speed is good enough to efficiently rewind changes by The Hive Fork Manager. If You are interested in what was really tested here are the performance tests for rewind [insert](https://gitlab.syncad.com/hive/haf/-/blob/develop/tests/integration/functional/hive_fork_manager/context_rewind/performance_insert_rows_one_by_one_test.sql),[delete](https://gitlab.syncad.com/hive/haf/-/blob/develop/tests/integration/functional/hive_fork_manager/context_rewind/performance_back_from_delete_rows_test.sql), [update](https://gitlab.syncad.com/hive/haf/-/blob/develop/tests/integration/functional/hive_fork_manager/context_rewind/performance_back_from_update_rows_test.sql) and [truncate](https://gitlab.syncad.com/hive/haf/-/blob/develop/tests/integration/functional/hive_fork_manager/context_rewind/performance_truncate_rows_test.sql) rows.
The Hive Fork Manger still suffers for the requirement for deferrable constraints, but only for FOREIGN KEY constraints. In one moment only one table can be rewind, and it may violate those constraints which are set between two tables. Narrowing the requirement only for FOREIGN KEY is less problematic for the applications than demanding all the constraints to be deferrable.

## How the shadow table is filled?
[The tables triggers](https://www.postgresql.org/docs/current/sql-createtrigger.html) are used to fill shadow tables. If a table needs to rewind its content in the case of micro-fork, then it must be registered in The Hive Fork Manager. During registration, the table got its shadow and a set of triggers is enabled on the table. The triggers are sensitive for the table content modification, and each change fires a procedure that fills the shadow table with a new row. You can check what exactly happened during a table registration by looking at the function [hive.register_table]( https://gitlab.syncad.com/hive/haf/-/blob/develop/src/hive_fork_manager/context_rewind/register_table.sql). The applications don't execute the registration function directly, they need to add inheritance from its contexts tables, which will add `hive_rowid` column and automatically start  `hive.register_table` function.

### Triggers are very slow
The triggers add significant overhead to operations on the tables. The overhead doesn't matter when the applications work on live sync, which means each new block is processed every three seconds what is a lot of time, but when the blockchain is replayed and a large number of irreversible blocks have to be processed immediately one by one then the triggers overhead is not acceptable. The Hive Fork Manager returns to the application the number of irreversible blocks to process, and the application may temporarily pull its tables from HFM care with function 'hive.app_context_detach'. The triggers are removed and the applications may process irreversible blocks much faster. After finishing processing the irreversible block the application back its table under HFM control with `hive.app_context_attach` function.

## Will a shadow table grow forever together with each change to its origin table?
Each shadow table is truncated when the Hive node informs The Hive Fork Manager about considering a new block as an irreversible block. All information saved in shadow tables for irreversible blocks are removed, and thus the shadow tables contain only rows for blocks that are near the HEAD BLOCK.

## That's all for today
I hope the post may help applications developers to understand why the HAF API looks as it looks, and how the performance of the application may be hitten by the internals of the Hive Fork Manager. There is no explanation of how the HFM knows on which block the change saved in a shadow table occurs, I will explain this in the next post about how the HFM passes blocks to applications for further processing.

If You want to start to write Your first HAF application, then please look at my post about the Hive Fork Manager [documentation](https://hive.blog/hive-139531/@mickiewicz/when-you-want-to-write-a-haf-application-then-first-read-the-hive-fork-manager-documentation).
👍  , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
properties (23)
authormickiewicz
permlinkhow-the-hive-fork-manager-manage-the-hive-micro-forks
categoryhive-139531
json_metadata{"tags":["hive-139531","dev","haf","hive","hfm"],"image":["https://images.hive.blog/DQmdCyzcVauDmaQVRDMYVuiYJZj7c4oXjNwqUG3w9CiM2mb/hfm_rewind_title.png"],"links":["https://hive.blog/hive-139531/@mickiewicz/what-is-haf"],"app":"hiveblog/0.1","format":"markdown"}
created2021-11-21 18:21:15
last_update2021-11-21 18:21:15
depth0
children2
last_payout2021-11-28 18:21:15
cashout_time1969-12-31 23:59:59
total_payout_value29.369 HBD
curator_payout_value29.330 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length13,839
author_reputation20,569,771,333,459
root_title"How The Hive Fork Manager manage The Hive micro-forks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd0
post_id107,904,039
net_rshares20,000,587,964,473
author_curate_reward""
vote details (35)
@hivebuzz ·
Congratulations @mickiewicz! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s):

<table><tr><td><img src="https://images.hive.blog/60x70/http://hivebuzz.me/@mickiewicz/upvoted.png?202111221153"></td><td>You received more than 1000 upvotes.<br>Your next target is to reach 1250 upvotes.</td></tr>
</table>

<sub>_You can view your badges on [your board](https://hivebuzz.me/@mickiewicz) and compare yourself to others in the [Ranking](https://hivebuzz.me/ranking)_</sub>
<sub>_If you no longer want to receive notifications, reply to this comment with the word_ `STOP`</sub>



**Check out the last post from @hivebuzz:**
<table><tr><td><a href="/hivebuzz/@hivebuzz/pum-202111-18"><img src="https://images.hive.blog/64x128/https://i.imgur.com/FO3kVHG.png"></a></td><td><a href="/hivebuzz/@hivebuzz/pum-202111-18">Hive Power Up Month - Feedback from Day 18</a></td></tr></table>
properties (22)
authorhivebuzz
permlinkhivebuzz-notify-mickiewicz-20211122t122210
categoryhive-139531
json_metadata{"image":["http://hivebuzz.me/notify.t6.png"]}
created2021-11-22 12:22:12
last_update2021-11-22 12:22:12
depth1
children0
last_payout2021-11-29 12:22:12
cashout_time1969-12-31 23:59:59
total_payout_value0.000 HBD
curator_payout_value0.000 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length935
author_reputation370,317,975,808,390
root_title"How The Hive Fork Manager manage The Hive micro-forks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd10,000
post_id107,919,854
net_rshares0
@mickiewicz · (edited)
$0.05
Here is my personal proposition of  The Hive Fork Manger icon :)
![hive_fork_manager.png](https://images.hive.blog/DQmPtYBJ1f1Der9SLSDKBgNFKLU2LZVESMQMknVJg7gznRe/hive_fork_manager.png)
👍  
properties (23)
authormickiewicz
permlinkr2xx8o
categoryhive-139531
json_metadata{"image":["https://images.hive.blog/DQmPtYBJ1f1Der9SLSDKBgNFKLU2LZVESMQMknVJg7gznRe/hive_fork_manager.png"],"app":"hiveblog/0.1"}
created2021-11-21 21:05:15
last_update2021-11-23 10:55:57
depth1
children0
last_payout2021-11-28 21:05:15
cashout_time1969-12-31 23:59:59
total_payout_value0.028 HBD
curator_payout_value0.026 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length185
author_reputation20,569,771,333,459
root_title"How The Hive Fork Manager manage The Hive micro-forks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd0
post_id107,906,936
net_rshares19,172,864,870
author_curate_reward""
vote details (1)