The world's first wait-free Software Transactional Memory
OneFile is a Software Transactional Memory (STM) meant to make it easy to implement lock-free and wait-free data structures. It is based on the paper "OneFile: A Wait-free Persistent Transactional Memory" by Ramalhete, Correia, Felber and Cohen https://github.com/pramalhe/OneFile/blob/master/OneFile-2019.pdf
It provides multi-word atomic updates on tmtype<T> objects, where T must be word-size, typically a pointer or integer. During a transaction, each store on an tmtpye<T> is transformed into a double-word-compare-and-swap DCAS() and one more regular CAS() is done to complete the transaction. It does this with a store-log (write-set) which other writers can help apply. This is a "redo-log" based technique, which means that both store and loads need to be interposed. Stores will be interposed to save them in the log and loads will be interposed to lookup on the log the most recent value. If there is a transaction currently ongoing, the readers will have to check on each tmtype::pload() if the variable we're tying to read is part of the current transaction. If a value is read whose 'seq' is higher than the transaction we initially read, the whole read-only operation will be restarted, by throwing an exception in the tmtype::pload() interposing method and catching this exception by the TM. All of this logic is handled internally by OF without any explicit user interaction.
Because of operator overloading, the assignment and reading of tmtype<T> types is done transparently to the user, with a pure library implementation, without any need for compiler instrumentation. This means that the user can write the code as if it was a sequential implementation of the data structure, apart from the change of types (type annotation). In this sense, OneFile is a "quasi-universal construction" with lock-free progress.
Our design goal with OneFile was to provide a non-blocking STM so that non-experts could implement their own lock-free and wait-free data structures. OneFile is not designed to transform regular everyday code into lock-free applications. Such use-cases require a lot more of engineering work and likely a completely different approach from what we took with OneFile (CX is a much better option for that purpose).
We've made two implementations in the form of Persistent Transactional Memory (PTM) which are STMs meant for Persistent Memory, like Intel's Optane DC Persistent Memory.
We've implemented four diferent variants of this design:
See the respective .hpp files for implementation details.
Each implementation is a single header file. Yes, it's that small :)
If you just want to use OneFile in your own application or benchmarks then follow these steps:
Choose one of the four OneFile implementations, depending on whether you want and STM, a PTM, lock-free or wait-free progress:
stms/OneFileLF.hpp STM with lock-free transactions
stms/OneFileWF.hpp STM with wait-free transactions
ptms/POneFileLF.hpp PTM with lock-free transactions
ptms/POneFileWF.hpp PTM with wait-free transactions
Copy the header to your development folder
Include the header from a single .cpp. If you include from multiple compilation units (.cpp files) then move the last block in the .hpp to one of the .cpp files.
If you want a data structure that is already made then take a look at what's on these folders:
datastructures/ Data structures for volatile memory (needs one of the STMs) pdatastructures/ Data structures for persistent memory (needs one of the PTMs)
In OneFile STM a transaction goes through three phases. The first phase is to convert the operation (lambda) into a store-log (write-set). There is no need to save the loads (read-set) because unlike other approaches, a transaction does not need to re-check for changes at commit time: it does a check in-flight on each load of whether or not the value has changed since the beginning of the transaction, by looking at a sequence number associated with every read value, a technique similar to TL2 or TinySTM but without the need for keeping a read-set because all write transactions are executed one at a time, effectively serialized. The second phase is to commit the transaction by advancing the current transaction (curTx). The third phase is to apply the store-log using DCAS.
The first phase is implicitly serializable. Even if each thread publishes its operation, there is no way to parallelize this work among threads. The best that could be done would be that each thread to transform its own operation into its own store-log which it then appends to a global store-log. Unfortunately this is possible only for disjoint-access parallel transactions, and these are not easy to detect, therefore, our implementation of OneFile does not do this. Instead, we attempt to parallelize the second stage, where the store-log is applied. This task is easier to split among multiple threads, thus parallelizing it. Adding Flat-Combining or other similar aggregation techniques to the first stage, means that each thread will produce a store-log containing the operations of all other threads. This can be a bottleneck if the operations involves heavy computations and small store-logs. For data structures, this is not the case, and OneFile is designed to implement and work with data structures or other scenarios where the transactions are short in time, therefore, we found it acceptable to go with such an approach.
The parallelization of the third phase can be done with at least two different approaches: blocking and non-blocking. In the blocking approach, the store-log can be divided into chunks (for example, one chunk per thread), each chunk having a lock, and the thread that takes the lock is responsible for applying that chunk. In the non-blocking approach (OF-LF and OF-WF), each thread tries to apply an entry of the store-log at a time. To avoid ABA issues, a double-word compare-and-swap (DCAS) must be used.
In summary, OneFile does not do disjoint-access parallel transactions. If you absolutely need that functionality, then go and take a look at TinySTM.
We're using a customized implementation of Hazard Eras, lock-free/wait-free memory reclamation:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/papers/hazarderas-2017.pdf
https://dl.acm.org/citation.cfm?id=3087588
See the HazardErasOF class in each implementation for more details.
As far as we know, there is only one wait-free data structure that has integrated wait-free memory reclamation:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/papers/crturnqueue-2016.pdf
OneFile and CX are the first time that someone has made a generic mechanism for wait-free memory reclamation.
The biggest advantage of all is that it's way easier to use OneFile than it is to implement a hand-made lock-free or wait-free data structure.
There are some working examples in the "datastructures/" folder: OFLFBoundedQueue.hpp: An array based queue (memory-bounded) OFLFLinkedListQueue.hpp: A linked list based queue (memory unbounded) OFLFLinkedListSet.hpp: A linked list based set OFLFRedBlackBST.hpp: A Red-Black (balanced) tree map
To build the benchmarks you need to build ESTM and TinySTM, and then you need to pull PMDK (PMEM/NVML) and build it:
cd ~/onefile/stms/
cd estm-0.3.0
make clean ; make
cd ..
cd tinystm
make clean ; make
cd ..
cd ~
git clone https://github.com/pmem/pmdk.git
cd pmdk
make -j12
sudo make install
export PMEM_IS_PMEM_FORCE=1
cd ~/onefile/graphs
make -j12
The four implementations of OneFile were executed during thousands of cpu hours and heavily stress tested with invariant checking and using tools like address sanitizer and valgrind. This is a lot more than what other STMs on github provide, but it doesn't mean there are no bugs in it ;) If you see a crash or invariant failure, run the same code under a global rw-lock to make sure the bug is not in your code. If you really believe it's in OneFile, then please open a bug on github and add as much information as you can, namely, stack trace and files needed to reproduce. We'll do our best to address it.