pack64

Pack64 is a vector encoding, with code for encoding and decoding it in Python, Ruby, and JavaScript. It packs a vector into a kind-of-floating-point, kind-of-base64 representation, requiring only 3 bytes per vector entry, plus one byte for an exponent shared by the whole vector.

This is meant for transmitting vector data over a network, in a situation where:

- Arbitrary bytes can't be transmitted
- We need to send the vector in as few bytes as possible
- Simply base64-encoding floating-point data -- at 5.33 bytes per entry -- isn't small enough
- A loss of precision is acceptable, as long as the properties of the vector remain the same
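
The 5.33 figure comes from base64 expanding every 3 bytes into 4 characters, applied to 4-byte single-precision floats. A rough way to check the arithmetic in Python, where both function names are hypothetical helpers written for this comparison:

```python
import base64
import struct

def base64_float32_length(n):
    """Characters needed to base64-encode n single-precision floats."""
    raw = struct.pack("<%df" % n, *([0.0] * n))  # 4 bytes per entry
    return len(base64.urlsafe_b64encode(raw))

def pack64_length(n):
    """Characters in a pack64 string: one exponent digit plus 3 per entry."""
    return 3 * n + 1

n = 300
print(base64_float32_length(n) / n)  # about 5.33 characters per entry
print(pack64_length(n) / n)          # about 3 characters per entry
```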

Possible applications include rapidly updating a vector using STOMP, or encoding a vector in a URL.

If you wonder why you'd need this, the fact is that you might not. But at Luminoso, we send lots of vectors over the network.

Specifications

pack64

The pack64 function returns a string b of bytes, representing digits using the URL-safe base64 character set (A-Z, a-z, 0-9, -, _), as follows:

b[0] contains the power-of-two exponent, biased by 40. That is:
- An exponent of 0 ("A") means to multiply all the integers that follow by 2^-40.
- An exponent of 30 ("e") means to multiply the integers by 2^-10 (that is, divide them by 1024).
- An exponent of 40 ("o") means to leave the integers as is.
- An exponent of 63 ("_") means to multiply the integers by 2^23.
Call this number 2^(b[0] - 40) the increment.
The increment is chosen to maximize precision. To choose the increment, find the smallest integer "a" such that 2^a is larger than all the magnitudes in the vector. The correct value for the increment is then 2^(a - 17), so that every entry divided by the increment has a magnitude below 2^17. However, if this gives an increment lower than 2^-40, use 2^-40 instead.
b[1:4], b[4:7], etc. contain the values in the vector, as 18-bit, big-endian, two's-complement integers, which will all be multiplied by the increment. That is:
- AAA represents 0.
- AAB represents 1.
- AAC represents 2.
- ___ represents -1.
- __- represents -2.
- f__ represents the highest possible value, (2^17 - 1).
- gAA represents the lowest possible value, -(2^17).
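
The three-digit values listed above can be reproduced with a short Python sketch. This is an illustration of the bit layout only, not the code shipped in this repository; the names ALPHABET, encode_int18, and decode_int18 are hypothetical.

```python
# URL-safe base64 digits, in order of value 0..63.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

def encode_int18(n):
    """Encode an integer in [-2**17, 2**17 - 1] as three base64 digits."""
    if not -(1 << 17) <= n < (1 << 17):
        raise OverflowError("value does not fit in 18 bits")
    u = n & 0x3FFFF  # two's-complement wrap into 18 unsigned bits
    # Big-endian: the most significant 6 bits come first.
    return ALPHABET[u >> 12] + ALPHABET[(u >> 6) & 63] + ALPHABET[u & 63]

def decode_int18(chars):
    """Decode three base64 digits back into a signed 18-bit integer."""
    u = 0
    for c in chars:
        u = (u << 6) | ALPHABET.index(c)
    # Values with the top bit set are negative.
    return u - (1 << 18) if u >= (1 << 17) else u

print(encode_int18(-2))      # "__-"
print(decode_int18("f__"))   # 131071, which is 2**17 - 1
```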

For a vector of K entries, the last value will be found in b[3*K-2 : 3*K+1], so the length of the string overall will be 3*K + 1.

This encoding can represent positive, negative, or zero values, with magnitudes from 2^-40 to approximately 2^40, as long as the other values in the vector are comparable in magnitude.
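
The rule for choosing the increment can be sketched in a few lines of Python. This is an illustration rather than the repository's implementation: the name choose_exponent_digit is hypothetical, the input is assumed to be finite, and mapping an all-zero vector to the smallest exponent is an assumption, since the text above does not cover that case.

```python
import math

def choose_exponent_digit(values):
    """Return the biased exponent (0..63) that would be stored in b[0]."""
    largest = max((abs(v) for v in values), default=0.0)
    if largest == 0.0:
        return 0  # assumption: any increment represents zeros exactly
    # frexp returns (m, a) with largest == m * 2**a and 0.5 <= m < 1,
    # so a is the smallest integer with 2**a > largest.
    a = math.frexp(largest)[1]
    if a > 40:
        raise OverflowError("magnitude of 2**40 or greater")
    # Increment is 2**(a - 17), floored at 2**-40, then biased by 40.
    return max(a - 17, -40) + 40

print(choose_exponent_digit([0.5, -3.0]))  # 25, an increment of 2**-15
```

With a largest magnitude of 3, the smallest power of two above it is 2^2, so the increment is 2^(2 - 17) = 2^-15 and the stored digit has value 25 ("Z").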

If the vector contains infinity or NaN, pack64 raises a ValueError; you probably don't want to transmit that sort of meaningless vector. If the vector contains an entry of magnitude 2^40 or greater, pack64 raises an OverflowError.
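
Putting the specification together, an encoder might look roughly like the following pure-Python sketch. It is not the implementation in this repository: the name pack64_sketch is hypothetical, an all-zero vector is assumed to use the smallest exponent, and clamping a value that rounds up to 2^17 is an assumption about an edge case the text does not describe.

```python
import math

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

def pack64_sketch(vector):
    """Encode a sequence of numbers following the layout described above."""
    values = [float(v) for v in vector]
    if any(math.isnan(v) or math.isinf(v) for v in values):
        raise ValueError("vector contains NaN or infinity")
    largest = max((abs(v) for v in values), default=0.0)
    if largest >= 2.0 ** 40:
        raise OverflowError("vector contains a magnitude of 2**40 or greater")

    if largest == 0.0:
        exponent = -40  # assumption: zeros are exact under any increment
    else:
        a = math.frexp(largest)[1]      # smallest a with 2**a > largest
        exponent = max(a - 17, -40)
    increment = 2.0 ** exponent

    digits = [ALPHABET[exponent + 40]]  # b[0]: exponent biased by 40
    for v in values:
        n = round(v / increment)
        # Assumption: clamp the rare value that rounds up to exactly 2**17.
        n = max(-(1 << 17), min((1 << 17) - 1, n))
        u = n & 0x3FFFF                 # 18-bit two's complement
        digits.append(ALPHABET[u >> 12] + ALPHABET[(u >> 6) & 63] + ALPHABET[u & 63])
    return "".join(digits)

print(pack64_sketch([1.0, -1.0, 0.0]))  # "YQAAwAAAAA"
```

In this example the largest magnitude is 1, so the increment is 2^-16 (digit "Y"), and 1.0 is stored as the integer 2^16 ("QAA").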

unpack64

Unpacks a vector that has been encoded with the pack64 function.

Takes in a string of 3n+1 URL-safe base64 characters, and returns a vector of length n, such that unpack64(pack64(vec)) is equal to vec within about five significant digits of precision, measured against the largest entries of vec.

In Python, this returns a NumPy vector of dtype np.float32. In JavaScript, it returns a standard array of numbers.
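
Decoding reverses the layout from the pack64 section. The following is a minimal pure-Python sketch, not the repository's implementation: unpack64_sketch is a hypothetical name, and it returns a plain list of Python floats rather than a NumPy array.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
DIGIT = {c: i for i, c in enumerate(ALPHABET)}

def unpack64_sketch(b):
    """Decode a string of 3n + 1 URL-safe base64 digits into n floats."""
    if len(b) % 3 != 1:
        raise ValueError("expected 3n + 1 characters")
    increment = 2.0 ** (DIGIT[b[0]] - 40)  # b[0] is the exponent, biased by 40
    result = []
    for i in range(1, len(b), 3):
        # Three digits form one big-endian 18-bit integer.
        u = (DIGIT[b[i]] << 12) | (DIGIT[b[i + 1]] << 6) | DIGIT[b[i + 2]]
        if u >= 1 << 17:   # top bit set: negative in two's complement
            u -= 1 << 18
        result.append(u * increment)
    return result

print(unpack64_sketch("oAAB___"))  # [1.0, -1.0]
```

Here "o" is an exponent of 40, which leaves the integers as they are, so "AAB" and "___" decode to 1 and -1.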

Precision

Each entry in a pack64 vector is specified with 18 bits of precision. This is 3/4 of the precision of an IEEE 754 single-precision float, which has 24 bits of precision.

The important difference is that every value in the vector is specified with the same level of granularity. The exponent defines 2^18 possible values for each entry, and every entry chooses from the same 2^18 values.

This will lose precision in entries of a vector that are much smaller than other entries, possibly even rounding them to 0. However, when the larger entries are accurately represented, the precision of the small entries matters less.
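
The effect on small entries can be seen by rounding a vector to the shared grid without building the string. A minimal sketch, where quantize is a hypothetical helper that follows the increment rule from the pack64 section and assumes finite input:

```python
import math

def quantize(vector):
    """Round each entry to the grid of values that one shared increment allows."""
    largest = max(abs(v) for v in vector)
    a = math.frexp(largest)[1]            # smallest a with 2**a > largest
    increment = 2.0 ** max(a - 17, -40)   # same rule as in the pack64 section
    return [round(v / increment) * increment for v in vector]

print(quantize([1000.0, 0.3, 0.001]))  # [1000.0, 0.296875, 0.0]
```

With a largest entry of 1000, the increment is 2^-7 = 0.0078125. The entry 0.3 keeps only about two significant digits, and 0.001 is smaller than half an increment, so it rounds to 0.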
