A Python module to help you manage your bits
Fixing a few regressions introduced in 4.2.0.
This release contains a fairly large refactor of how different types are managed. This
shouldn't affect the end user, and the main noticeable change should be the new Dtype
class, which is optional to use.
Support for 8-bit and smaller floats has been reworked and expanded. These are still a 'beta' feature.
Backwardly incompatible changes:
'uint:foo' instead of 'uintfoo'.
auto in constructors.
Other changes:
The Array class is no longer 'beta'.
A new Dtype class can be optionally used to specify types.
The bitstring.options object is now the preferred method for changing module options.
The bitstring.lsb0 and bitstring.bytealigned variables are now deprecated, use bitstring.options.lsb0 and bitstring.options.bytealigned instead.
New fromstring method as another way to create bitstrings from formatted strings. Instead of relying on the auto parameter you can now optionally use fromstring.
>>> s1 = BitArray('u24=1000') # This is still fine
>>> s2 = BitArray.fromstring('u24=1000') # This may be clearer and more efficient.
>>> s.pp('u15, bin')
Pretty printing is now prettier - optional terminal colours added.
A range of 8-bit, 6-bit and even 4-bit float formats added (beta):
p3binary8: IEEE 8-bit floating point with 3 bit precision.
p4binary8: IEEE 8-bit floating point with 4 bit precision.
e5m2mxfp: OCP 8-bit floating point with 3 bit precision.
e4m3mxfp: OCP 8-bit floating point with 4 bit precision.
e2m3mxfp: OCP 6-bit floating point with 4 bit precision.
e3m2mxfp: OCP 6-bit floating point with 3 bit precision.
e2m1mxfp: OCP 4-bit floating point with 2 bit precision.
e8m0mxfp: OCP 8-bit unsigned floating point designed to scale the other formats.
mxint: OCP 8-bit floating point that is a scaled integer representation.
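To make the smallest of these formats concrete, the sketch below decodes a 4-bit pattern as one sign bit, two exponent bits and one mantissa bit with an exponent bias of 1. That layout is an assumption taken from the OCP Microscaling description of E2M1, the function name `decode_e2m1` is hypothetical, and this is not bitstring's own implementation. It only illustrates how few distinct values such a format can represent.

```python
def decode_e2m1(bits: int) -> float:
    """Decode a 4-bit E2M1 pattern (assumed layout: 1 sign, 2 exponent, 1 mantissa bit)."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("expected a 4-bit value")
    sign = -1.0 if bits & 0b1000 else 1.0
    exponent = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    if exponent == 0:
        # Subnormal: no implicit leading one, so the only non-zero value is 0.5.
        magnitude = mantissa * 0.5
    else:
        # Normal: implicit leading one, exponent bias assumed to be 1.
        magnitude = (1.0 + mantissa * 0.5) * 2.0 ** (exponent - 1)
    return sign * magnitude


# The eight non-negative patterns, in increasing bit order.
positive_values = [decode_e2m1(b) for b in range(8)]
print(positive_values)  # [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```

With only eight magnitudes available, formats like this depend on a shared scale factor, which is the role the list above gives to e8m0mxfp.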
Performance improvements.
Fixing a regression introduced in 4.1.3
A maintenance release, with some changes to the beta features introduced in 4.1.
Another maintenance release. Once again some small changes to the 'beta' Array class, plus new Array functionality.
== will now return an Array (see above item).
A maintenance release, with some changes to the Array class which is still in 'beta'.
This has turned into a surprisingly big release, with a major refactor and a brand new class (the first for 12 years!). There are also a couple of small, possibly breaking changes detailed below; in particular, 'auto' initialising bitstrings from integers is now disallowed.
The major weakness of bitstring has been its poor performance for computationally intensive tasks relative to lower level alternatives. This was principally due to relying on pure Python code to achieve things that the base language often didn't have fast ways of doing.
This release starts to address that problem with a fairly extensive rewrite to replace much of the pure Python low-level bit operations with methods from the bitarray package. This is a package that does many of the same things as bitstring, and the two packages have co-existed for a long time. While bitarray doesn't have all of the options and facilities of bitstring it has the advantage of being very fast as it is implemented in C. By replacing the internal datatypes I can speed up bitstring's operations while keeping the same API.
Huge kudos to Ilan Schnell for all his work on bitarray.
If your data is all of the same type you can make use of the new Array class, which mirrors much of the functionality of the standard array.array type, but doesn't restrict you to just a dozen formats.
>>> from bitstring import Array
>>> a = Array('uint7', [9, 100, 3, 1])
>>> a.data
BitArray('0x1390181')
>>> b = Array('float16', a.tolist())
>>> b.append(0.25)
>>> b.tobytes()
b'H\x80V@B\x00<\x004\x00'
>>> b.tolist()
[9.0, 100.0, 3.0, 1.0, 0.25]
The data is stored efficiently in a BitArray object, and you can manipulate both the data and the Array format freely. See the main documentation for more details. Note that this feature carries the 'beta' flag so may change in future point versions.
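As an illustration of what "stored efficiently" means for the uint7 example above, the following stdlib-only sketch packs the same four values back to back, 7 bits each, and reproduces the hex digits shown for `a.data`. The helper names `pack_uints` and `unpack_uints` are hypothetical and this is not how the Array class is implemented internally; it only shows the bit layout.

```python
def pack_uints(values, width):
    """Concatenate unsigned integers, each `width` bits wide, into one big integer."""
    packed = 0
    for value in values:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        # Shift what we have so far left and append the new value on the right.
        packed = (packed << width) | value
    return packed, width * len(values)


def unpack_uints(packed, width, count):
    """Split a packed integer back into `count` unsigned `width`-bit values."""
    mask = (1 << width) - 1
    return [(packed >> (width * (count - 1 - i))) & mask for i in range(count)]


packed, nbits = pack_uints([9, 100, 3, 1], 7)
print(nbits, format(packed, "x"))  # 28 1390181
print(unpack_uints(packed, 7, 4))  # [9, 100, 3, 1]
```

Four 7-bit values occupy 28 bits, where a byte-per-item array would need 32.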
>>> a = Bits(float8_143=16.5)
>>> a.bin
'01100000'
>>> a.float8_143
16.0
>>> a = BitArray(100) # Fine - create with 100 zeroed bits
>>> a += 0xff # TypeError - previously this would have appended 0xff (=255) zero bits.
>>> a += '0xff' # Probably what was meant - append eight '1' bits.
>>> a += Bits(255) # Fine, append 255 zero bits.
This is a breaking change, but it breaks loudly with an exception, it is easily recoded, and it removes a confusing wrinkle.
__auto. In the unlikely event this breaks code, the fix should be just to delete the auto= if it's already the first parameter.
>>> s = Bits(auto='0xff') # Now raises a CreationError
>>> s = Bits('0xff') # Fine, as always
Deleting, replacing or inserting into a bitstring resets the bit position to 0 if the bitstring's length has been changed. Previously the bit position was adjusted but this was not well defined.
Only empty bitstrings are now considered False in a boolean sense. Previously s was False if no bits in s were set to 1, but this goes against what it means to be a container in Python so I consider this to be a bug, even if it was documented. I'm guessing it's related to __nonzero__ in Python 2 becoming __bool__ in Python 3, and it's never been fixed before now.
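The container convention referred to here is that Python falls back to `__len__` when a class defines no `__bool__`, so only an empty container is falsy. The sketch below uses a hypothetical stand-in class, `TinyBits`, which is not part of bitstring, to show that rule and how a test for set bits can be kept as a separate, explicit call.

```python
class TinyBits:
    """Hypothetical stand-in for a bitstring; not part of the bitstring package."""

    def __init__(self, binary: str = ""):
        if set(binary) - {"0", "1"}:
            raise ValueError("only '0' and '1' are allowed")
        self._bits = binary

    def __len__(self) -> int:
        # With no __bool__ defined, bool() uses this: falsy only when empty.
        return len(self._bits)

    def has_set_bit(self) -> bool:
        # Explicit check for any bit equal to 1, separate from truthiness.
        return "1" in self._bits


zeros = TinyBits("000")
empty = TinyBits("")
print(bool(zeros), zeros.has_set_bit())  # True False
print(bool(empty))                       # False
```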
Casting to bytes now behaves as expected, so that bytes(s) gives the same result as s.tobytes(). Previously it created a byte per bit.
Pretty printing with the 'bytes' format now uses characters from the 'Latin Extended-A' unicode block for non-ASCII and unprintable characters instead of replacing them with '.'.
When using struct-like codes you can now use '=' instead of '@' to signify native-endianness. They behave identically, but the new '=' is now preferred.
More fixes for LSB0 mode. There are now no known issues with this feature.
A maintenance release.
This is a major release which drops support for Python 2.7 and has a new minimum requirement of Python 3.7. Around 95% of downloads satisfy this - users of older versions can continue to use bitstring 3.1, which will still be supported with fixes, but no new features.
Other changes are minimal, with a few features added.
Type hints added throughout the code.
Underscores are now allowed in strings representing number literals.
The copy() method now works on Bits as well as BitArray objects.
The experimental command-line feature is now official. Command-line
parameters are concatenated and a bitstring created from them. If
the final parameter is either an interpretation string or ends with
a '.' followed by an interpretation string then that interpretation
of the bitstring will be used when printing it. ::
$ python -m bitstring int:16=-400
0xfe70
$ python -m bitstring float:32=0.2 bin
00111110010011001100110011001101
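The two outputs above can be cross-checked with the standard library alone. The sketch below is only a verification aid under the assumption that both values are encoded big-endian; the helper names `int16_hex` and `float32_bin` are hypothetical and do not describe how the command-line tool works internally.

```python
import struct


def int16_hex(value: int) -> str:
    """Two's-complement 16-bit big-endian encoding of `value`, shown as hex."""
    return "0x" + struct.pack(">h", value).hex()


def float32_bin(value: float) -> str:
    """IEEE 754 32-bit big-endian encoding of `value`, shown as 32 binary digits."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    return format(raw, "032b")


print(int16_hex(-400))   # 0xfe70
print(float32_bin(0.2))  # 00111110010011001100110011001101
```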
New pp() method that pretty-prints the bitstring in various formats - useful especially in interactive sessions. Thanks to Omer Barak for the suggestion and discussion.
>>> s.pp()
0: 10001000 01110110 10001110 01110110 11111000 01110110 10000111 00101000
64: 01110010 11111001 10000111 10011000 11110111 10011110 10000111 11111101
128: 11111001 10001100 01111111 10111100 10111111 11011011 11101011 11111011
192: 1100
>>> s.pp('bin, hex')
0: 10001000 01110110 10001110 01110110 11111000 01110110 88 76 8e 76 f8 76
48: 10000111 00101000 01110010 11111001 10000111 10011000 87 28 72 f9 87 98
96: 11110111 10011110 10000111 11111101 11111001 10001100 f7 9e 87 fd f9 8c
144: 01111111 10111100 10111111 11011011 11101011 11111011 7f bc bf db eb fb
192: 1100 c
Shorter and more versatile properties. The bin, oct, hex, float, uint and int properties can now be shortened to just their first letter. They can also have a length in bits after them - allowing Rust-like data types. ::
>>> s = BitArray('0x44961000')
>>> s.h
'44961000'
>>> s.f32
1200.5
>>> s.u
1150685184
>>> s.i7 = -60
>>> s.b
'1000100'
>>> t = Bits('u12=160, u12=120, b=100')
Support for IEEE 16 bit floats. Floating point types can now be 16 bits long as well as 32 and 64 bits. This is using the 'e' format from the struct module.
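For reference, this is what the struct module's 'e' format produces on its own. The sketch assumes big-endian byte order and uses illustrative values; it is a stdlib-only demonstration of the encoding, not a description of bitstring's internals.

```python
import struct

values = [9.0, 100.0, 3.0, 1.0, 0.25]

# '>5e' packs five IEEE 754 half-precision floats, big-endian, two bytes each.
packed = struct.pack(">5e", *values)
print(packed)       # b'H\x80V@B\x00<\x004\x00'
print(len(packed))  # 10

# All five values are exactly representable in 16 bits, so they round-trip.
print(list(struct.unpack(">5e", packed)))  # [9.0, 100.0, 3.0, 1.0, 0.25]
```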
Support for the bfloat format. This is a specialised 16-bit floating point format mostly used in machine learning. It is essentially a truncated IEEE 32-bit floating point number that has the same range but much less accuracy.
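The "truncated IEEE 32-bit" description can be sketched by keeping only the top 16 bits of a float32. The function names below are hypothetical, and plain truncation is an assumption made for simplicity: a real conversion may round to nearest instead, so results can differ in the last bit.

```python
import struct


def float_to_bfloat16_bits(value: float) -> int:
    """Return the top 16 bits of the float32 encoding of `value` (truncation, no rounding)."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat pattern to a float by zero-filling the low 16 bits."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


print(hex(float_to_bfloat16_bits(1.0)))                         # 0x3f80
print(bfloat16_bits_to_float(float_to_bfloat16_bits(1200.5)))  # 1200.0
```

The sign and all eight exponent bits survive, which is why the range matches float32, while only seven of the 23 mantissa bits remain, which is why 1200.5 comes back as 1200.0.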
Removed requirement to have a colon before lengths in format strings. So for example 'uint:12=100' can be just 'uint12=100'. The colon is still recommended for readability if the length isn't given as a number literal.
Pulled due to a bug when using Python 3.7.