statejournal.py
from ethereum.utils import sha3
from ethereum.utils import big_endian_to_int, int_to_big_endian, zpad
import rlp
import os
"""
Efficient journal-based, cryptographically authenticated data structure
as an alternative to the Merkle Patricia Tree.
The basic idea is to update a hash function with the stream of state updates,
which are written to a journal, instead of using a Merkle tree.
The current state is directly mapped to a key-value store.
While the Merkle Patricia Tree is good at supporting access to states in different chains,
it has runtime, memory and storage issues.
Potential Improvements:
- x50 lower storage requirements (x20 if keeping the full journal, x1000 if deleting storage)
- x12/x17 faster for reads/writes
- x135/x70 lower system IO for reads/writes
- x6 lower memory footprint (leveldb needs to cache key prefixes)
- supports pruning of old data
- straightforward implementation (direct k,v mapping + journal)
The storage improvements are due to
- less data (no Merkle tree levels)
- compressible keys, because keys to account storage locations can share the same prefix
Limitation:
- Rollbacks to unexpectedly old state (i.e. to uncached state) are expensive
Design Goals:
- minimal footprint for the current state
- fast read / write access
- pruning old data (deleted state, blocks, txs, accounts) should be possible
- suitable for DHT based long term storage of old data
- fast reconstruction of the current state (syncing) w/o going through the vm
Base assumptions:
- in a practical BC system state is final after N blocks
- thus most of the old state is irrelevant
- hashes are incompressible and slow, use them sparingly
- checking current state and computing new states happens frequently and should be fast
- querying recent states (chain branch switch) happens frequently and should be fast
- querying old states happens infrequently (SSV)
Implementation:
- split current state and historical states
- current state is only the current data at addresses
- historical state is stored in a local journal or a DHT
- old blocks, txs and receipts are historical state
- use a hash digest to track state updates H(H'|H(update))
- keys, values are directly mapped to leveldb
- an update counter (stored next to the value) is used to reference older states
- recover old states by
    - using in-memory snapshots (for up to N blocks)
    - backward reading the journal
- all state changes are written to a journal
- to support log(n) SSVs, the journal merges in state_digests
  at tx and block boundaries (see below)
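The digest chaining H(H'|H(update)) listed above can be sketched as follows.
This is a minimal illustration and not the module's code: `H` uses
hashlib.sha256 as a stand-in for the sha3 (Keccak-256) helper imported above,
and the update payloads and the `replay` helper are made up for the example.

```python
import hashlib


def H(data):
    # Stand-in hash (assumption): the module itself uses ethereum.utils.sha3.
    return hashlib.sha256(data).digest()


EMPTY_DIGEST = H(b"")


def next_digest(prev_digest, update):
    # H(H' | H(update)): chain the previous digest with the hash of the update.
    return H(prev_digest + H(update))


def replay(updates):
    # Fold a stream of updates into a single state digest.
    digest = EMPTY_DIGEST
    for update in updates:
        digest = next_digest(digest, update)
    return digest


updates = [b"key1=value1", b"key2=value2", b"key1=value3"]
digest = replay(updates)
```

Replaying the same updates in the same order reproduces `digest`, while any
reordering or modification of an update yields a different digest.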
Changes to the current Ethereum protocol:
- block header:
    - change: state_root is now the state_digest
    - change: tx_list_root becomes H(H(tx0), ... H(txN))
    - change: receipts_root becomes H(H(R0), ... H(RN))
    - add: update_counter
    - add: state_digest of a second prev block (so we get a tree structure for log(n) SSVs)
      referenced block number is the highest divisor of the current block number
      for divisors in [2^0, 2^1, ... 2^n]
- tx receipts:
    - change: medstate is now the state_digest
    - add: update_counter
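The description of the second prev block leaves open how the referenced block
is derived from the divisor. One possible reading, assumed here and not
confirmed by the text, is a skip-list style reference: take the largest power
of two d that divides block number n and reference block n - d. The function
names below are hypothetical.

```python
def highest_pow2_divisor(n):
    # Largest power of two that divides n (n must be positive).
    if n <= 0:
        raise ValueError("block number must be positive")
    return n & -n


def skip_reference(n):
    # Assumed interpretation: the second referenced block is n - d.
    return n - highest_pow2_divisor(n)


def hops_to_genesis(n):
    # Follow the skip references from block n down to block 0.
    path = [n]
    while n > 0:
        n = skip_reference(n)
        path.append(n)
    return path


path = hops_to_genesis(13)  # [13, 12, 8, 0]
```

Under this reading the number of hops equals the number of set bits in the
block number, which is what gives the logarithmic bound.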
Changes to the Ethereum implementation:
- Changes to block header and tx receipt
- StateJournal replaces Trie
- blocks that are not yet considered to be in the final chain
need to cache their state updates in memory (or a dedicated StateJournal)
Simplified State Verification (SSV):
A light client wants an SSV for a (value, update_counter) tuple:
    The client asks the network for an SSV and receives the shortest chain of state_digests
    from the last value change to the current state_digest (stored in the latest block).
    The recursively hashed state_digests must match the current state_digest.
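A client-side check of such a chain might look like the sketch below. It
follows the O(n) chain layout produced by `get_ssv` further down
([prev_state_digest, log_hash, log_hash, ...]). hashlib.sha256 stands in for
sha3, and `build_journal`, `make_ssv` and `verify_ssv` are hypothetical helper
names that exist only for this example.

```python
import hashlib


def H(data):
    # Stand-in hash (assumption): the module itself uses ethereum.utils.sha3.
    return hashlib.sha256(data).digest()


def build_journal(updates):
    # State digest after each update, starting from the empty digest.
    digests, d = [], H(b"")
    for u in updates:
        d = H(d + H(u))
        digests.append(d)
    return digests


def make_ssv(updates, digests, start):
    # start is a 1-based update_counter, as in the journal.
    prev = H(b"") if start == 1 else digests[start - 2]
    return [prev] + [H(u) for u in updates[start - 1:]]


def verify_ssv(log, hash_chain, trusted_digest):
    # The first log hash must belong to the update being verified.
    if len(hash_chain) < 2 or hash_chain[1] != H(log):
        return False
    d = hash_chain[0]
    for log_hash in hash_chain[1:]:
        d = H(d + log_hash)
    return d == trusted_digest


updates = [b"a=1", b"b=2", b"a=3"]
digests = build_journal(updates)
chain = make_ssv(updates, digests, 2)
ok = verify_ssv(b"b=2", chain, digests[-1])
```

The client only needs to trust the final state_digest; everything else in the
chain is checked by rehashing.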
A light client wants to know the current state of an account property:
    The client asks the network, with a key corresponding to the account property,
    for (value, update_counter). At least one answer must be honest.
    The client then asks the network for an SSV.
A light client wants to download/check a transaction/block_header:
    Roughly the same as above, but with the transaction/block_header hash as the key.
A light client wants to know an old state of an account property at `target update_counter`:
    Get the current state, which contains the `previous update_counter`.
    The returned `previous update_counter` references the update_counter of the state change
    at that key previous to the last update.
    While `previous update_counter` > `target update_counter`:
        Ask the network for the journal entry at `previous update_counter`
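The backward walk for an old state can be sketched with plain dictionaries
standing in for the network. The layout is an assumption modelled on this
module: `current` maps key -> (value, update_counter) and `journal` maps
update_counter -> (key, value, prev_update_counter). `value_at` is a
hypothetical helper name.

```python
def value_at(journal, current, key, target_uc):
    # Start from the current (value, update_counter) of the key.
    value, uc = current.get(key, (b"", 0))
    # Follow the per-key chain of previous update counters backwards.
    while uc > target_uc:
        _, _, prev_uc = journal[uc]
        if prev_uc == 0:
            # The key did not exist at target_uc.
            return b"", 0
        _, value, _ = journal[prev_uc]
        uc = prev_uc
    return value, uc


journal = {
    1: (b"a", b"v1", 0),
    2: (b"b", b"w1", 0),
    3: (b"a", b"v2", 1),
    4: (b"a", b"v3", 3),
}
current = {b"a": (b"v3", 4), b"b": (b"w1", 2)}

old = value_at(journal, current, b"a", 2)  # (b"v1", 1)
```

Each step touches only journal entries for the requested key, so the cost
depends on how often that key changed, not on the total journal length.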
Light clients want to collectively validate a block or watch updates:
    Download and SSV the necessary data.
DHT based StateJournal:
update_counter => [state_digest, key, value, old_counter]
For every entry to be stored, an SSV is necessary.
Only updates from blocks which are considered final should be added to the DHT.
"""
b32 = 2**32
b16 = 2**16
class StateJournal(object):

    state_journal_fn = 'state_journal'
    state_journal_index_fn = 'state_journal.idx'
    empty_state_digest = sha3('')

    """
    Updates to the state are tracked by state_digest updates.
    Previous states are referenced by an update_counter.
    The StateJournal only tracks one chain, i.e. it does not keep track of forks.
    A StateJournal (continuing from the previous one) can be created for every block and deleted
    once the block is considered to be in the final chain.

    SSVs are supported by
    - providing a (value, update_counter) tuple and the current state_digest
    - traversing the journal up to the current state_digest

    Rollbacks are supported by
    - reading the log backwards and restoring the old values
    - non-final states are better kept in a chain of in-memory StateJournals

    Data structure:
        Key Value Store (the state db):
            mapping(key : rlp[value, update_counter])
            note: `key` can be of arbitrary size
        Journal Log:
            state_digest[32] | rlp[key, value, old_counter] | log_size[2]
        Journal Index:
            journal_pos_ptr[4]
            i.e. the end position of a log entry is stored at (update_counter-1) * 4
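
The journal and index layout above can be illustrated with in-memory buffers.
This is a sketch under assumptions: bytearrays replace the two files, the log
payload is an opaque byte string instead of an RLP list, and `append_entry` /
`read_entry` are hypothetical names.

```python
import struct

DIGEST_LEN = 32


def append_entry(journal, index, state_digest, log):
    # Journal entry: state_digest[32] | log | entry_length[2]
    entry_len = DIGEST_LEN + len(log) + 2
    journal.extend(state_digest + log + struct.pack(">H", entry_len))
    # Index entry: 4-byte big-endian end position of the journal entry.
    index.extend(struct.pack(">I", len(journal)))


def read_entry(journal, index, update_counter):
    # The first update has update_counter == 1.
    (end,) = struct.unpack_from(">I", index, (update_counter - 1) * 4)
    (entry_len,) = struct.unpack_from(">H", journal, end - 2)
    start = end - entry_len
    digest = bytes(journal[start:start + DIGEST_LEN])
    log = bytes(journal[start + DIGEST_LEN:end - 2])
    return digest, log


journal, index = bytearray(), bytearray()
append_entry(journal, index, bytes(32), b"first")
append_entry(journal, index, bytes([1]) * 32, b"second entry")
entry = read_entry(journal, index, 2)
```

Storing the entry length at the end of each entry is what allows the journal
to be read backwards without consulting the index.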
"""
    def __init__(self, db):
        # the journal holds binary data, so open both files in binary append mode
        self.journal = open(os.path.join(db.dbfile, self.state_journal_fn), 'ab')
        self.journal_index = open(os.path.join(db.dbfile, self.state_journal_index_fn), 'ab')
        self.db = db
        # continue from the last journal entry, if there is one
        l = JournalReader(db).last_update()
        if l:
            self.state_digest = l['state_digest']
            self.update_counter = l['update_counter']
        else:
            self.state_digest = self.empty_state_digest
            self.update_counter = 0
        print 'uc/state', self.update_counter, self.state_digest.encode('hex')
    def get_raw(self, key):
        "returns (value, update_counter)"
        try:
            v = self.db.get(key)
            val, counter = rlp.decode(v)
            counter = big_endian_to_int(counter)
            return val, counter
        except KeyError:
            return b'', 0

    def get(self, key):
        "returns value"
        return self.get_raw(key)[0]
    def update(self, key, value):
        """
        - increases the update counter
        - retrieves the old_update_counter for the key
        - stores the value in leveldb
        - generates a log: rlp[key, value, old_update_counter]
        - computes the new state_digest as: H(last_state_digest | H(log))
        - adds to the journal: state_digest | log | journal_entry_length
        - updates the index with the position of the end of the above journal_entry
        """
        self.update_counter += 1
        old_value, old_counter = self.get_raw(key)
        # store in leveldb
        if value:
            _stored_value = rlp.encode([value, self.update_counter])
            self.db.put(key, _stored_value)
        else:
            self.db.delete(key)
        # generate log
        log = rlp.encode([key, value, old_counter])
        # update state
        self.state_digest = sha3(self.state_digest + sha3(log))
        # state_digest | [key, value, old_counter] | journal_entry_length
        self.journal.write(self.state_digest)
        self.journal.write(log)
        journal_entry_length = 32 + len(log) + 2
        assert journal_entry_length < b16, journal_entry_length
        self.journal.write(zpad(int_to_big_endian(journal_entry_length), 2))  # 2 bytes
        # write index
        pos = self.journal.tell()
        assert pos < b32
        idx = zpad(int_to_big_endian(pos), 4)  # 4 bytes
        self.journal_index.write(idx)
        # debug
        # self.commit()
        # jr = JournalReader(self.db)
        # r = jr.read_update(self.update_counter)
        # assert r['value'] == value
        # print r
    def commit(self):
        self.journal_index.flush()
        self.journal.flush()
        self.db.commit()

    def delete(self, key):
        "actually deletes the key from the database"
        self.update(key, '')
    def rollback(self, update_counter, verify=False):
        """
        Roll back to the state after `update_counter`, based on the local journal.
        In practice this file-based rollback should not be used;
        instead, updates for young blocks which are probably not final yet
        should be held in memory.
        """
        # make buffered journal writes visible to the reader
        self.journal.flush()
        self.journal_index.flush()
        # read log backwards
        jr = JournalReader(self.db)
        for uc in reversed(range(update_counter + 1, self.update_counter + 1)):
            u = jr.read_update(uc)
            key = u['key']
            # restore the old value together with its update counter
            prev_uc = u['prev_update_counter']
            v = jr.read_update(prev_uc)['value'] if prev_uc > 0 else b''
            if v:
                self.db.put(key, rlp.encode([v, prev_uc]))
            else:
                self.db.delete(key)
            # read state before the update we reverted
            if uc > 1:
                state_digest = jr.read_update(uc - 1)['state_digest']
            else:
                state_digest = self.empty_state_digest
            if verify:
                assert sha3(state_digest + u['log_hash']) == u['state_digest']
            self.state_digest = state_digest
        self.update_counter = update_counter
        # truncate the logfile and index (the files are append-only, so read via the reader)
        if update_counter > 0:
            jr.journal_index.seek((update_counter - 1) * 4)
            log_end_pos = big_endian_to_int(jr.journal_index.read(4))
        else:
            log_end_pos = 0
        self.journal_index.truncate(update_counter * 4)
        self.journal.truncate(log_end_pos)
EOF = 2  # whence value for seek(): relative to the end of the file


class JournalReader(object):
    """
    Read-only access to the journal log and journal index written by StateJournal.
    """

    def __init__(self, db):
        self.journal = open(os.path.join(db.dbfile, StateJournal.state_journal_fn), 'rb')
        self.journal_index = open(os.path.join(db.dbfile, StateJournal.state_journal_index_fn),
                                  'rb')

    def update_counter(self):
        "number of updates in the journal (one 4 byte index entry per update)"
        self.journal_index.seek(0, EOF)
        return self.journal_index.tell() // 4

    def last_update(self):
        uc = self.update_counter()
        if uc == 0:
            return {}
        return self.read_update(uc)
    def read_update(self, update_counter):
        "first update has update_counter=1"
        # assert update_counter > 0
        # the index stores the end position of each journal entry
        self.journal_index.seek((update_counter - 1) * 4)
        log_end_pos = big_endian_to_int(self.journal_index.read(4))
        # the last 2 bytes of an entry hold its total length
        self.journal.seek(log_end_pos - 2)
        log_len = big_endian_to_int(self.journal.read(2))
        # jump back to the start of the entry
        self.journal.seek(-log_len, 1)
        state_digest = self.journal.read(32)  # state_digest after change
        log = self.journal.read(log_len - 32 - 2)
        key, value, prev_update_counter = rlp.decode(log)
        prev_update_counter = big_endian_to_int(prev_update_counter)
        return dict(key=key, value=value, prev_update_counter=prev_update_counter,
                    state_digest=state_digest, log_hash=sha3(log), update_counter=update_counter)
    def validate_state(self, last_update_counter):
        "replays the digest chain from the empty state and checks every journal entry"
        state_digest = StateJournal.empty_state_digest
        for i in range(1, last_update_counter + 1):
            l = self.read_update(i)
            assert sha3(state_digest + l['log_hash']) == l['state_digest']
            state_digest = l['state_digest']
        return state_digest
    def get_ssv(self, update_counter_start):
        """
        Returns all hashes from a given value up to the current state.
        Recursively hashing them up should lead to the current state root.
        note: the user first needs to know or query and trust
        - the current state root
        - the current (value, update_counter)
        - i.e. at least one queried peer needs to give an honest answer
        The PoC implementation is O(n), but can be changed to O(log(n)) by
        - adding state_digests to txs and (tree-like) to blocks
        """
        # read the update
        r = self.read_update(update_counter_start)
        if update_counter_start == 1:
            prev_state_digest = StateJournal.empty_state_digest
        else:
            prev_state_digest = self.read_update(update_counter_start - 1)['state_digest']
        r['hash_chain'] = [prev_state_digest, r['log_hash']]
        # append the log hashes of all later updates up to the current state
        for update_counter in range(update_counter_start + 1, self.update_counter() + 1):
            r['hash_chain'].append(self.read_update(update_counter)['log_hash'])
        return r