BlockSci Versions

A high-performance tool for blockchain science and exploration

v0.7.0

3 years ago

Version 0.7.0 is based on the development branch v0.6, but is not compatible with v0.5 or v0.6 parsings and requires a full reparse of the blockchain.

Notable Changes

  • Python: New fluent interface

    A new fluent Python interface allows many operations (such as filtering transactions) to be executed efficiently in C++, resulting in a major performance increase.

  • Parser support for CTOR (Canonical Transaction Ordering Rule)

    BlockSci's parser has been updated to support arbitrary transaction ordering rules within a block (e.g., Bitcoin Cash's canonical transaction ordering).

  • Transaction Input<->Output mapping

    Inputs now reference the output they are spending, and vice versa. These can be looked up using blocksci.Output.spending_input and blocksci.Input.spent_output.

  • New config files to store configurations for different blockchains

    Settings for blockchain parsings are now stored in a JSON config file. Find more information in the setup instructions.

  • Testing: Python test suite and CI have been added

    We've added a small test suite for the Python interface. It uses a special regtest blockchain created using our testchain generator. Current test coverage is limited, and we welcome contributions to extend it.
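The input/output mapping listed above can be pictured with a small stand-alone model. This is only a sketch and does not use BlockSci itself: the `Output` and `Input` classes and the `spend` helper are hypothetical, and only the attribute names `spending_input` and `spent_output` are borrowed from the release notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Output:
    """Toy transaction output (hypothetical; not the BlockSci class)."""
    tx_index: int
    value: int
    spending_input: Optional["Input"] = None  # None while the output is unspent


@dataclass
class Input:
    """Toy transaction input (hypothetical; not the BlockSci class)."""
    tx_index: int
    spent_output: Output


def spend(output: Output, spending_tx_index: int) -> Input:
    """Create an input that spends `output` and link both directions."""
    if output.spending_input is not None:
        raise ValueError("output is already spent")
    new_input = Input(tx_index=spending_tx_index, spent_output=output)
    output.spending_input = new_input
    return new_input


out = Output(tx_index=0, value=5_000)
inp = spend(out, spending_tx_index=1)
```

After `spend`, either side can be reached from the other: `out.spending_input` is `inp`, and `inp.spent_output` is `out`.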

Important bug fixes

  • v0.5 contains a bug that causes reused addresses to receive a new ID, rather than the previously assigned ID. If the address had been used multiple times before, subsequent occurrences would receive the old ID again, resulting in lookups that show only one transaction associated with the address. If the address had been used only once before, a lookup will miss this previous occurrence. Bitcoin parsings beyond block height 572072 are affected. You can find more information in this document. (Issue #272)
  • The parser now builds the index to look up wrapping addresses. Previously, when retrieving the equiv address from a wrapped address (e.g., P2PK that is wrapped by a P2SH address), it would not include the wrapping address. (PR #402)
  • v0.6 only: incorrect handling of compressed public keys resulted in multisig addresses not being correctly deduplicated (i.e. they would receive a new ID on reuse). (Fixed on 01/31/2020, PR #367)
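The address-ID fixes above come down to one invariant: an address that has been seen before must map to the ID it was given the first time. A minimal stand-alone sketch of that intended behavior follows; the `AddressIndex` class is hypothetical and unrelated to the parser's actual data structures.

```python
from typing import Dict


class AddressIndex:
    """Assigns a stable integer ID to each distinct address string."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def id_for(self, address: str) -> int:
        """Return the existing ID for `address`, or assign the next free one."""
        if address not in self._ids:
            self._ids[address] = len(self._ids) + 1
        return self._ids[address]


index = AddressIndex()
first = index.id_for("addr-A")
second = index.id_for("addr-B")
reused = index.id_for("addr-A")  # reuse must yield the original ID
```

The bug described for v0.5 corresponds to `reused` differing from `first`, which splits one address's history across two IDs.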

Other changes and bug fixes

  • Multisig addresses with invalid public keys are considered non-standard
  • Updated dependencies (including range-v3, pybind11 and RocksDB)
  • blocksci.Address.[ins|outs|in_txes|out_txes|txes] now return iterators
  • The blocksci.heuristics.change.ChangeHeuristic interface has been rewritten.
    • Heuristics now return a blocksci.OutputIterator instead of a set of outputs.
    • ChangeHeuristic.unique_change now returns a new ChangeHeuristic object, allowing it to be composed with other change heuristics.
    • A new None heuristic has been added and is also the default heuristic for change address clustering (effectively disabling it).
    • A new Spent heuristic allows refining heuristics that return unspent outputs as potential change outputs.
    • Clustering no longer performs change address clustering by default. You can still specify a change address heuristic to enable it.
  • Added a new tool to the parser (blocksci config.json doctor) that can detect a few common issues with the setup
  • Added a new tool (blocksci_check_integrity) that computes a hash value over the BlockSci data produced by the parser
  • Correctly handle Schnorr signatures on Bitcoin Cash with a length of 65 bytes (PR #395)
  • Changed in(s)/out(s) in method/property names to input(s)/output(s) to avoid confusion with incoming and outgoing funds (PR #392)
  • Fixed an inconsistency in recording the tx that first spends a script (PR #385)
  • chain.cpp.filter_tx has been removed in favor of the new fluent interface (Issue #254)
  • Recognize address formats that use more than one version byte (Issue #246)
  • The parser will detect if another instance is already running on the same data directory (Issue #211)
  • blocksci.cluster.ClusterManager.create_clustering now accepts a start and end height for clustering only a specific block range (does not apply to linking of wrapped with wrapping addresses) (Issue #118)
  • Fixed rounding inconsistencies for values in Zcash (Issue #117)
  • Added Witness Unknown address type support (Issue #112)
  • Added transaction version numbers (Issue #92)
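The idea of composable change heuristics from the list above can be illustrated without BlockSci. In this sketch everything is hypothetical: `ToyHeuristic`, the two example rules, and the choice of `&` for intersection are assumptions made for illustration, and `unique_change` is modeled as a property only to show why returning a heuristic (rather than a result) keeps it composable.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

# A toy transaction: a tuple of output values (hypothetical stand-in).
ToyTx = Tuple[int, ...]


@dataclass(frozen=True)
class ToyHeuristic:
    """Wraps a function mapping a transaction to candidate change output indexes."""
    select: Callable[[ToyTx], FrozenSet[int]]

    def __call__(self, tx: ToyTx) -> FrozenSet[int]:
        return self.select(tx)

    def __and__(self, other: "ToyHeuristic") -> "ToyHeuristic":
        # Keep only the outputs that both heuristics flag.
        return ToyHeuristic(lambda tx: self(tx) & other(tx))

    @property
    def unique_change(self) -> "ToyHeuristic":
        # Returns a new heuristic that yields a result only when exactly
        # one candidate remains, so it can still be combined with others.
        def only_if_unique(tx: ToyTx) -> FrozenSet[int]:
            candidates = self(tx)
            return candidates if len(candidates) == 1 else frozenset()
        return ToyHeuristic(only_if_unique)


# Two illustrative rules: outputs with non-round values, and the smallest output.
non_round = ToyHeuristic(
    lambda tx: frozenset(i for i, v in enumerate(tx) if v % 1000 != 0)
)
smallest = ToyHeuristic(
    lambda tx: frozenset({tx.index(min(tx))}) if tx else frozenset()
)
# A heuristic that never flags anything, analogous to disabling change clustering.
none_heuristic = ToyHeuristic(lambda tx: frozenset())

combined = (non_round & smallest).unique_change
```

For the toy transaction `(1234, 777, 50000)`, `non_round` flags outputs 0 and 1, `smallest` flags output 1, and `combined` therefore returns the single index 1.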

Known bugs and limitations

  • Performance of directly accessing addresses and iterators/ranges in the Python interface is slower than in v0.5 (only noticeable when accessing them in large volumes)
  • Iterating over an AddressIterator in pure Python causes a segfault. Use .to_list() to retrieve a list of the results over which you can iterate.

v0.5.1

5 years ago

Feature Enhancements

  • Expanded iterator and range functionality to return NumPy arrays.

    Many methods and properties of BlockSci objects return range or iterator objects such as blocksci.TxRange. These objects allow vectorized operations over sequences of BlockSci objects. Their API matches the API of their member objects, so blocksci.TxRange has almost the same set of methods as blocksci.Tx. These methods efficiently call the given method on all items in the range or iterator. Depending on the return type of the method, the result will be another range, a NumPy array, or a Python list. For further information, look for these classes in the reference.

  • Add custom BlockSci pickler to enable sending and receiving serialized BlockSci objects. This means that returning BlockSci objects from the multiprocessing interface now works correctly.

  • Enhance the change address heuristics interface

    Change address heuristics are now composable in order to form new customized heuristics using the blocksci.heuristics.change.ChangeHeuristic interface. These can be used in combination with the new clustering interface described below.

  • Incorporate clustering module into main BlockSci library

    The formerly external clustering module is now available as blocksci.cluster. Further, it is now possible to generate new clusterings through the Python interface using the blocksci.cluster.ClusterManager.create_clustering method. Users can select their choice of change address heuristic in order to experiment with different clustering strategies.

  • Simplified build system

    BlockSci's install process no longer requires the compilation of any external dependencies to compile on Ubuntu 16.04. The BlockSci library no longer has any public dependencies so compiling against it will not require linking against anything else.

    The CMake build script has been updated to install a Config file, which allows you to use find_package(blocksci) to import BlockSci's targets into your build script. This makes it much easier to build libraries that use BlockSci as a dependency.

    The BlockSci python module has been moved into a separate module to allow for a simple SetupTools or pip based install process: pip install -e pyblocksci. The main BlockSci library must be installed first for this to work.

    Finally, install instructions for macOS have been added along with Ubuntu 16.04 instructions.

  • Updated mempool recorder and integrated it into BlockSci interface.

    For instructions on running the mempool recorder and using the data it produces, see the setup section.

  • Improve and clean up auto generated API reference.

    All method signatures display correct types and all properties display the type of the returned value. Further, all types link to their definition in the documentation.
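The range behavior described in the first enhancement above (a range exposing the same properties as its members and applying them to every item) can be mimicked in a few lines of plain Python. This is a conceptual sketch only: `ToyTx` and `ToyTxRange` are hypothetical, results are returned as Python lists rather than NumPy arrays, and the real ranges perform this work in C++.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass(frozen=True)
class ToyTx:
    """Hypothetical stand-in for a transaction object."""
    fee: int
    output_count: int


class ToyTxRange:
    """Forwards attribute access to every item and collects the results."""

    def __init__(self, items: Sequence[ToyTx]) -> None:
        self._items = list(items)

    def __getattr__(self, name: str) -> List[Any]:
        # __getattr__ is only called for names not found on the range
        # itself; private names are rejected to avoid accidental recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        return [getattr(item, name) for item in self._items]


txes = ToyTxRange([ToyTx(fee=100, output_count=2), ToyTx(fee=250, output_count=1)])
fees = txes.fee
```

Here `txes.fee` evaluates the `fee` property on each member in one call, which is the access pattern the range and iterator classes are designed around.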

Bug Fixes

v0.4.5

6 years ago

Feature Enhancements

  • Safe incremental updates

    Following a number of enhancements, BlockSci is now capable of safely performing incremental updates. The AWS distribution of BlockSci now includes a Bitcoin full node and will automatically update the blockchain once per hour. For local installations of BlockSci, see the readme for setup instructions.

  • Introduced the new concept of Equivalent Addresses, which includes two types of equivalence: Type Equivalent and Script Equivalent. Type Equivalent refers to two addresses using the same secret in different ways, such as a single pubkey being used for both a Pay to Pubkey Hash address and a Pay to Witness Pubkey Hash address. Script Equivalent refers to a Pay to Script Hash address being equivalent to the address it contains. Address.equiv() and the EquivAddress class were added to support these concepts. See the documentation for more information.

  • Enabled the opening of multiple Blockchain objects in the same notebook by removing internal usage of Singleton pattern.

  • Proper handling of segwit tx and block size distinctions. This included updating the parser to store the size of each transaction excluding segwit data, as well as supporting the 3 new notions of size that segwit introduced.

  • Proper handling of bech32 addresses.

    • Blockchain.address_from_string() now supports lookup of bech32 addresses.

    • Address objects now display the correct human readable address depending on the address type.

  • Reduced initial chain parsing time from 24 hours to 12 hours and reduced the parser data size by unifying the hash index database and the parser address hash index database.
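The segwit size notions mentioned in the list above are presumably those defined in BIP 141: total size, weight, and virtual size. Assuming that reading, they follow from the base size (the transaction serialized without witness data) and the total size as sketched below. The function names are illustrative and are not BlockSci API.

```python
def weight(base_size: int, total_size: int) -> int:
    """BIP 141 weight: three times the base size plus the total size."""
    return base_size * 3 + total_size


def virtual_size(base_size: int, total_size: int) -> int:
    """BIP 141 virtual size: weight divided by 4, rounded up."""
    return (weight(base_size, total_size) + 3) // 4


# A segwit transaction: 140 bytes without witness data, 250 bytes with it.
segwit_weight = weight(140, 250)
segwit_vsize = virtual_size(140, 250)

# A non-segwit transaction has no witness data, so base size equals total size.
legacy_vsize = virtual_size(200, 200)
```

For a transaction without witness data the virtual size equals the serialized size, which is why only segwit transactions make the distinction visible.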

Breaking Changes

  • Updated to new data version for the parser output requiring a rerun of the blocksci_parser.

  • In order to allow multiple blockchain objects, all constructors and factory methods were removed, with parallel methods added to the chain object. For instance, Tx(hash) is now chain.tx_with_hash(hash).

  • Removed Address.script and merged its functionality into Address.

  • Modified Address.outs(), Address.balance(), and related functions to only return results for places on the blockchain where that address appeared in a top-level context (not wrapped inside another address).

  • Renamed various methods from using script in their name to address in order to reflect updated terminology.

  • Removed ScriptType since its functionality was superseded by EquivAddress.
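The move from global constructors to methods on the chain object, noted in the list above, can be motivated with a toy model: once lookups are scoped to a chain instance, two chains can coexist in one session without sharing state. The `ToyChain` class and its data are hypothetical; only the method name `tx_with_hash` comes from the release notes.

```python
from typing import Dict


class ToyChain:
    """Hypothetical chain object that owns its own transaction index."""

    def __init__(self, name: str, txes: Dict[str, int]) -> None:
        self.name = name
        self._txes = dict(txes)  # tx hash -> block height

    def tx_with_hash(self, tx_hash: str) -> int:
        """Look up a transaction in this chain only (KeyError if absent)."""
        return self._txes[tx_hash]


# Two independent chains can hold different data under the same key.
btc = ToyChain("bitcoin", {"aa11": 100})
ltc = ToyChain("litecoin", {"aa11": 7, "bb22": 8})
```

A module-level `Tx(hash)` constructor would have to consult some implicit global chain, which is what a singleton design forces and what prevents opening several chains at once.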

Bug Fixes

  • Fixed segwit size handling as stated above. (Issue #43)
  • Fixed chain.filter_txes. (Issue #50)
  • Fixed P2SH API issues. (Issue #53)

Issue #43: https://github.com/citp/BlockSci/issues/43
Issue #50: https://github.com/citp/BlockSci/issues/50
Issue #53: https://github.com/citp/BlockSci/issues/53

v0.4

6 years ago

Version 0.4 introduces full bech32 address support, adds segwit size support, and fixes a bug which had been preventing use of continuous incremental blockchain updates.

v0.3

6 years ago

This release of BlockSci adds SegWit support along with numerous other changes and fixes.

v0.2

6 years ago

This was the initial public release of BlockSci.