
RE: Presenting FracPack - A Better Zero-copy Binary Serialization Format for C++ and Rust by dan


Viewing a response to: @inertia/re-dan-s4o2xp

· @dan ·
Forward and backward compatibility is largely a function of making as many fields as possible in your data structures "optional". This is the approach used by every format I have seen. The cost of making something optional is the extra data you need to encode to indicate whether each optional field is present.

The smallest way to indicate an optional field is a single bit, but if a field is present then you need to know where to find it, which requires an offset (4 bytes). There is no extra overhead for making dynamically sized types optional in FracPack because the struct's heap offset pointer can signal absence in the pointer itself. Likewise, empty strings and vectors have no extra overhead and use the same amount of data as an optional field.
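To make the "pointer signals absence" idea concrete, here is a minimal sketch in Rust. It is not the FracPack wire format: the layout (a 4-byte little-endian slot holding 0 for "absent", otherwise the distance from the slot to a length-prefixed payload) and the names `read_u32` and `read_optional_bytes` are assumptions made up for illustration.

```rust
/// Read a little-endian u32 at `pos`, or None if the buffer is too short.
fn read_u32(buf: &[u8], pos: usize) -> Option<u32> {
    let bytes = buf.get(pos..pos.checked_add(4)?)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Hypothetical layout (an assumption, not the FracPack spec):
/// the 4-byte slot at `slot` holds 0 when the field is absent; otherwise it
/// holds the distance in bytes from the slot to a u32 length followed by
/// that many payload bytes.
/// Note: this sketch returns None for both "absent" and "malformed".
fn read_optional_bytes(buf: &[u8], slot: usize) -> Option<&[u8]> {
    let offset = read_u32(buf, slot)? as usize;
    if offset == 0 {
        return None; // absence is encoded in the slot itself, no extra flag
    }
    let start = slot.checked_add(offset)?;
    let len = read_u32(buf, start)? as usize;
    let data_start = start.checked_add(4)?;
    buf.get(data_start..data_start.checked_add(len)?)
}
```

The point of the sketch is that the reader pays for one 4-byte slot whether or not the field is present, and needs no separate presence flag.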

For speed of zero-copy access to fields, it is best to know a constant offset to the field you are looking for rather than having to first read a variable length bitfield. 
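The difference between the two lookups can be sketched as follows. Both functions are hypothetical, and the bitfield variant assumes a fixed 32-bit presence mask for simplicity, whereas the comment above talks about a variable-length one (which would add a further decoding step).

```rust
/// Fixed-slot layout: every field owns a 4-byte slot, so its position is
/// pure arithmetic on a compile-time constant index.
fn fixed_slot_pos(field_index: usize) -> usize {
    field_index * 4
}

/// Bitfield layout (assumed 32-bit mask): only present fields are stored,
/// so the reader must first load the mask, test the field's bit, and count
/// how many lower-indexed fields are present.
fn bitfield_slot_pos(presence: u32, field_index: u32) -> Option<usize> {
    if field_index >= 32 || presence & (1u32 << field_index) == 0 {
        return None; // field not present (or index out of range)
    }
    let below = presence & ((1u32 << field_index) - 1);
    Some(below.count_ones() as usize * 4)
}
```

With fixed slots the position depends only on the schema; with a bitfield it depends on data that has to be read first.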

So the tradeoff we made is that null optional fields use 4 bytes each, all 0, unless all fields after them are also null, in which case we can truncate the fixed region and each absent trailing optional field uses 0 bytes.
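As a rough illustration of the truncation rule, the sketch below computes the fixed-region size for a struct in which every field is an optional 4-byte slot. The function name `fixed_region_size` and the all-optional assumption are mine, not part of FracPack.

```rust
/// Size in bytes of the fixed region, assuming every field is an optional
/// 4-byte slot: slots are kept up to and including the last present field,
/// and any absent trailing fields are dropped entirely.
fn fixed_region_size(present: &[bool]) -> usize {
    match present.iter().rposition(|&p| p) {
        Some(last) => (last + 1) * 4,
        None => 0, // nothing present: the whole region is truncated away
    }
}
```

For five fields where only the first and third are present, the region is 12 bytes: the absent second field still costs 4 bytes, while the absent fourth and fifth cost nothing.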

If data size is an issue, then a fast zero-compression algorithm like the one Cap'n Proto uses can all but eliminate the overhead; however, compressing and decompressing can slow down the first access and introduces potential security issues such as decompression bombs.
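Here is a toy zero-run codec to show both the benefit and the risk. It is deliberately not Cap'n Proto's packing scheme (which works on 8-byte words with tag bytes); the encoding used here, a `0x00` byte followed by a run length of 1 to 255, and the `max_output` cap are assumptions chosen to keep the sketch short.

```rust
/// Toy encoding (an assumption, not Cap'n Proto packing): non-zero bytes are
/// copied as-is; a run of 1..=255 zero bytes becomes the pair [0x00, run_len].
fn pack_zeros(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        if input[i] == 0 {
            let mut run = 0usize;
            while i + run < input.len() && input[i + run] == 0 && run < 255 {
                run += 1;
            }
            out.push(0);
            out.push(run as u8);
            i += run;
        } else {
            out.push(input[i]);
            i += 1;
        }
    }
    out
}

/// Decode, refusing to produce more than `max_output` bytes. Without such a
/// cap, a tiny hostile input could expand into a very large allocation.
fn unpack_zeros(packed: &[u8], max_output: usize) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < packed.len() {
        if packed[i] == 0 {
            let run = *packed.get(i + 1)? as usize;
            if run == 0 || out.len() + run > max_output {
                return None; // malformed run or output limit exceeded
            }
            out.resize(out.len() + run, 0);
            i += 2;
        } else {
            if out.len() + 1 > max_output {
                return None;
            }
            out.push(packed[i]);
            i += 1;
        }
    }
    Some(out)
}
```

Note that the decoded copy has to be materialized before any field can be read, which is the first-access cost mentioned above.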

So for the sake of speed and constant time access to fields via pre-computed offsets, optional data has an overhead of 4 bytes. This overhead can be mitigated if you know you will never want to remove fields from version 1 of your data types and you always include or exclude all (or most) of your future extended fields.
created 2023-11-25 18:47:33
@inertia ·
I’ve noticed in your past blockchain projects, like what became Hive, the use of an ‘extensions’ array at the end of definitions, which provided flexibility for future enhancements, such as the addition of ‘beneficiaries’ in the ‘comment’ object. Has this experience influenced your approach to data structure design in FracPack? Is that what you're referring to by "extended fields," here?
created 2023-11-25 19:01:27