def accum2_a(xs, ys, size):
    temp = 0
    for j in unroll(range(size)):
        temp += xs[size - j - 1] * ys[j]
    for j in unroll(range(size)):
        temp += xs[j] * ys[j]
    return temp

def sum0(data):
    sum = [0] * 16
    for i in unroll(range(8)):
        sum[i] = data[i] + data[i + 8]
    for i in unroll(range(4)):
        sum[i] += sum[i + 4]
    for i in unroll(range(2)):
        sum[i] += sum[i + 2]
    return sum[0] + sum[1]

def unroll16(start, stop):
    sum = 0
    for i in unroll(range(4)):
        sum += i
    for i in range(10):
        sum += i
    for i in unroll(range(4)):
        sum += i
    return sum

def nn_xor32_8(x:bit32)->bit8:
    rv_xor8:bit8 = 0
    for wi in unroll(range(len(DATA_W))):
        tmp_x:bit6 = count_bit32(x ^ DATA_W[wi])
        rv_xor8 <<= 1
        rv_xor8 |= 0 if tmp_x < DATA_th[wi] else 1
    return rv_xor8

def count_bit32(x:bit32)->bit6:
    mask:bit32 = 0xF
    shift_n = 0
    sum = 0
    for i in unroll(range(8)):
        sum += xbit4_to_n[((x & mask) >> shift_n)]
        mask <<= 4
        shift_n += 4
    return sum

def pipelined_unroll01(xs, ys):
    s = 0
    for i in pipelined(unroll(range(8), 2)):
        x = xs[i] + 1
        if x < 0:
            s = s + x
        else:
            s = s - x
        ys[i] = x
        #print(x)
    return s

def unroll01(xs, ys):
    s = 0
    for i in unroll(range(8)):
        x = xs[i] + 1
        if x < 0:
            s = s + x
        else:
            s = s - x
        ys[i] = x
        #print(x)
    return s

def fir_filter_streaming(input: int, coeff: list, previous: list) -> int:
    N = TAPS - 1
    for j in unroll(range(N)):
        jj = N - j
        previous[jj] = previous[jj - 1]
        #print(jj)
    previous[0] = input
    if previous[N] == 0:
        return 0
    else:
        temp = 0
        for j in pipelined(range(TAPS)):
            temp += previous[N - j] * coeff[j]
        return temp

def do_mnist7_mem(a:List[bit16], _mem:List[bit16], lst_len = LEN):
    rom_w = W_PARAM
    rom_b = B_PARAM
    mem = [0] * 10
    xi = 0
    for i in range(lst_len):
        x = a[i]
        for j in unroll(range(10)):
            mem[j] = bfloat.mul_add(x, rom_w[xi + j], mem[j])
        xi += 10
    for j in range(10):
        _mem[j] = bfloat.add(mem[j], rom_b[j])

def unroll05_c(xs):
    sum = 0
    for k in range(4, 4):
        for i in unroll(range(4, 4)):
            sum += xs[i]
    return sum

def unroll07_a(xs: list):
    sum = 0
    for i in unroll(range(0, 4, 2)):
        sum += xs[i]
    return sum

def unroll07_c(xs: list):
    sum = 0
    for i in unroll(range(1, 12, 2)):
        sum += xs[i]
    return sum

def unroll01(x):
    sum = 0
    for i in unroll(range(4), 2):
        for j in range(4):
            sum += (i * j * x)
    return sum

def unroll15_a(start, stop):
    sum = 0
    for i in unroll(range(start, stop), 4):
        sum += i
    return sum

def unroll10_a(xs:list):
    sum = 0
    for i in unroll(range(0, 10, 2), 4):
        sum += xs[i]
    return sum

def unroll14_b(stop):
    sum = 0
    for i in unroll(range(stop), 5):
        sum += i
    return sum

def unroll06_b(xs: list):
    sum = 0
    for i in unroll(range(2, 10), 4):
        sum += xs[i]
    return sum

def unroll17():
    sum = 0
    for i in unroll(range(-10, 10)):
        sum += f(i)
    return sum

def unroll04():
    u = unroll([1, 2, 3], 4)

def sum1(data):
    sum = 0
    for i in unroll(range(16)):
        sum += data[i]
    return sum

def unroll02_a():
    xs = [10, 20, 30, 40]
    sum = 0
    for x in unroll(xs):
        sum += x
    return sum

def unroll09_a(xs: list):
    sum = 0
    for i in unroll(range(10), 4):
        sum += xs[i]
    return sum

def unroll05_a(xs):
    sum = 0
    for i in unroll(range(2, 4)):
        sum += xs[i]
    return sum

def unroll06_a(xs: list):
    sum = 0
    for i in unroll(range(2, 4), 2):
        sum += xs[i]
    return sum

def shift_full(xs: list, x):
    for j in unroll(range(len(xs) - 1)):
        jj = (len(xs) - 1) - j
        xs[jj] = xs[jj - 1]
    xs[0] = x

def process_sha256(self):
    work = [0] * 64  # type: List[bit32]
    _h = [0] * 8  # type: List[bit32]
    __h = [0] * 8  # type: List[bit32]
    while is_worker_running():
        update = True
        for i in range(8):
            _h[i] = h[i]
        block_len512: bit512 = self.data_in.rd()
        block_len32 = block_len512
        count = 0
        #print(block_len512)
        #print(block_len32)
        while count < block_len32:
            #print(count, block_len32)
            count += 1
            #print("--=========")
            d512 = self.data_in.rd()
            #print("start d512 %5t", d512, "$time")
            shift_n = 480
            for i in unroll(range(16)):
                work[i] = (d512 >> shift_n) & 0xFFFFFFFF
                shift_n -= 32
            for i in range(16, 64):
                wi_15 = work[i - 15]
                s0 = rotr(wi_15, 7) ^ rotr(wi_15, 18) ^ (wi_15 >> 3)
                wi_2 = work[i - 2]
                s1 = rotr(wi_2, 17) ^ rotr(wi_2, 19) ^ (wi_2 >> 10)
                wi_16 = work[i - 16]
                wi_7 = work[i - 7]
                work[i] = (wi_16 + s0 + wi_7 + s1) & 0xFFFFFFFF
            with rule(unroll='full'):
                for i in range(8):
                    __h[i] = _h[i]
            for i in range(64):
                s0 = rotr(__h[0], 2) ^ rotr(__h[0], 13) ^ rotr(__h[0], 22)
                maj = (__h[0] & __h[1]) ^ (__h[0] & __h[2]) ^ (__h[1] & __h[2])
                t2 = s0 + maj
                s1 = rotr(__h[4], 6) ^ rotr(__h[4], 11) ^ rotr(__h[4], 25)
                ch = (__h[4] & __h[5]) ^ ((~__h[4]) & __h[6])
                t1 = __h[7] + s1 + ch + k[i] + work[i]
                __h[7] = __h[6]
                __h[6] = __h[5]
                __h[5] = __h[4]
                __h[4] = (__h[3] + t1) & 0xFFFFFFFF
                __h[3] = __h[2]
                __h[2] = __h[1]
                __h[1] = __h[0]
                __h[0] = (t1 + t2) & 0xFFFFFFFF
            with rule(unroll='full'):
                for i in range(8):
                    _h[i] = (_h[i] + __h[i]) & 0xFFFFFFFF
            # print("turn %5t", count, "$time")
        rv256: bit256 = 0
        with rule(unroll='full'):
            for i in range(8):
                rv256 <<= 32
                rv256 |= _h[i]
        #print("rv256 %5t", rv256, "$time")
        self.data_out.wr(rv256)

def unroll02(start, stop, step):
    sum = 0
    for i in unroll(range(start, stop, step), 5):
        sum += i
    return sum

def unroll10_c(xs:list):
    sum = 0
    for i in unroll(range(1, 10, 2), 3):
        sum += xs[i]
    return sum

def unroll08_b(xs: list):
    sum = 0
    for i in unroll(range(1, 9, 2), 2):
        sum += xs[i]
    return sum

def unroll03(x):
    sum = 0
    for i in unroll(unroll(range(4))):
        sum += i
    return sum