Implementations of various deep learning models for limit order book data: DeepLOB (Zhang et al., 2018), TransLOB (Wallbridge, 2020), DeepFolio (Sangadiev et al., 2020), etc.
LOBster is a project entitled <Limit order book (LOB) driven simultaneous time-series estimation in real-market-microstructure>, an end-to-end machine learning pipeline that predicts the future mid-price from limit order book data. The project provides the source code of the pipeline, which covers data processing, model training and inference. It contains an implementation of DeepLOB (Zhang et al., 2018) and our modified model.
We also provide code for handling FI-2010 (Ntakaris et al., 2017), a publicly available benchmark dataset for mid-price forecasting on limit order book data. In addition, we provide pre-processing tools for custom raw LOB datasets collected in a real market microstructure. The pre-processing tools contain several useful functions, such as down-sampling, normalization and labeling.
Lastly, the project provides modules that test the classification performance of a trained model. Specifically, it contains a simple market simulator that tests whether the model's inference works in a real market microstructure. It measures the trading performance (i.e. cumulative profit) based on the model's inference on the test set.
With loaders.krx_preprocess.__normalize_data__, you can normalize the raw collected data with three different methods: Z-scoring, min-max and decimal-precision normalization.
With loaders.krx_loader.__split_x_y__, you can generate labels for any arbitrary prediction horizon. The labeling method is detailed in the following equations.
$$m_{-}(t)=\frac{1}{k} \sum_{i=0}^k p_{t-i}$$
$$m_{+}(t)=\frac{1}{k} \sum_{i=0}^k p_{t+i}$$
$$l_t=\frac{m_{+}(t)-m_{-}(t)}{m_{-}(t)}$$
nvidia-smi
pip install -r requirements.txt
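The two pre-processing steps described above (the three normalization methods, and the labeling rule given by the equations for m_-(t), m_+(t) and l_t) can be sketched in plain Python as follows. This is a minimal illustration, not the code in loaders.krx_preprocess or loaders.krx_loader: the function names are hypothetical, the decimal-precision variant assumes "divide by the smallest power of ten that brings every absolute value below 1", and the labeling follows the equations exactly as written (each sum runs over i = 0..k and is divided by k).

```python
import math
import statistics


def z_score(values):
    """Z-scoring: subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes the values are not all equal
    return [(v - mean) / std for v in values]


def min_max(values):
    """Min-max normalization: rescale linearly so the values span [0, 1]."""
    lo, hi = min(values), max(values)  # assumes hi > lo
    return [(v - lo) / (hi - lo) for v in values]


def decimal_precision(values):
    """Decimal-precision normalization (assumed form): divide by a power of ten."""
    largest = max(abs(v) for v in values)
    power = math.floor(math.log10(largest)) + 1 if largest > 0 else 0
    return [v / 10 ** power for v in values]


def label_returns(prices, k):
    """Compute l_t for every t that has k past and k future prices (k >= 1).

    Follows the equations as written: both sums run over i = 0..k and are
    divided by k. Returns a dict mapping t to l_t.
    """
    labels = {}
    for t in range(k, len(prices) - k):
        m_minus = sum(prices[t - i] for i in range(k + 1)) / k
        m_plus = sum(prices[t + i] for i in range(k + 1)) / k
        labels[t] = (m_plus - m_minus) / m_minus
    return labels


# Hypothetical mid-price series, for illustration only.
mid_prices = [100.0, 101.0, 102.0, 103.0, 104.0]
labels = label_returns(mid_prices, k=2)
```

A positive l_t means the smoothed future mid-price lies above the smoothed past mid-price. How l_t is then mapped to class labels is not covered by this sketch.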
Edit optimizers/hyperparams.yaml to modify the hyperparameter settings. You can set the batch size, learning rate, epsilon, maximum number of epochs and number of workers used to load the dataset. Otherwise, the experiments will run with our fine-tuned hyperparameters.
[model name]:
  batch_size: 128
  learning_rate: 0.0001
  epsilon: 1e-08
  epoch: 30
  num_workers: 4
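The fallback behaviour described above (values you set take effect, everything else keeps the tuned defaults) can be sketched as below. This is an assumption-laden illustration rather than the project's loader: the Hyperparams class and resolve_hyperparams function are hypothetical names, and the defaults are copied from the snippet above. Values are coerced to the type of the default because some YAML loaders, PyYAML among them, read a value written as 1e-08 (no decimal point) as a string rather than a float.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Hyperparams:
    # Defaults copied from the hyperparameter snippet above.
    batch_size: int = 128
    learning_rate: float = 0.0001
    epsilon: float = 1e-08
    epoch: int = 30
    num_workers: int = 4


def resolve_hyperparams(overrides=None):
    """Return the defaults with any user-supplied values applied on top."""
    defaults = Hyperparams()
    known = {f.name for f in fields(Hyperparams)}
    updates = {}
    for name, value in (overrides or {}).items():
        if name not in known:
            raise KeyError(f"unknown hyperparameter: {name}")
        # Coerce to the default's type, so the string "1e-08" becomes a float.
        updates[name] = type(getattr(defaults, name))(value)
    return replace(defaults, **updates)


# Example: override two values, keep the rest at their defaults.
params = resolve_hyperparams({"epoch": 50, "epsilon": "1e-08"})
```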
Edit main.py to set the experiment parameters. Our base experiment settings are already implemented in main.py, so you don't have to modify it.
# experiment parameter setting
dataset_type = 'fi2010'
normalization = 'Zscore'
model_type = 'lobster'
lighten = True
T = 100
k = 4
stock = [0, 1, 2, 3, 4]
train_test_ratio = 0.7
dataset_type: Dataset for the experiment. You can select 'fi2010' or 'krx'. (Only 'fi2010' is available in the public demo version.)
normalization: Normalization method. 'Zscore', 'MinMax', and 'DecPre' are available.
lighten: Determines whether the experiment uses the 10-level LOB data or the reduced 5-level LOB data. If lighten is True, the experiment uses only the reduced 5-level data. This parameter affects not only the input dataset but also the architecture of the model.
model_type: Model used in the experiment. 'deeplob' and 'lobster' are available.
T: Length of the time window used in a single input. T = 100 is used in the paper and in our experiments.
k: Prediction horizon. For FI-2010, 0, 1, 2, 3 and 4 are available, indicating horizons of 10, 20, 30, 50 and 100 ticks, respectively. For KRX, any prediction horizon is available.
stock: Stock dataset used in the experiment. For FI-2010, [0, 1, 2, 3, 4] are available, each indicating an individual stock. For KRX, ['KS200', 'KQ150'] are available. You can use multiple stocks in a single experiment.
train_test_ratio: Ratio used to split the data into a training set and a test set. For example, if train_test_ratio is 0.7, the earliest 70% of days are used for the training set and the latest 30% for the test set.
python main.py
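The constraints on the experiment parameters listed above can be checked with a small helper such as the one below. It is a sketch only: check_params, split_days and FI2010_HORIZONS are hypothetical names that do not come from main.py, treating k as the horizon itself for KRX is an assumption, and rounding the split point to the nearest whole day is also an assumption about how train_test_ratio is applied.

```python
# FI-2010: index k -> prediction horizon in ticks, as listed above.
FI2010_HORIZONS = {0: 10, 1: 20, 2: 30, 3: 50, 4: 100}
ALLOWED_STOCKS = {"fi2010": {0, 1, 2, 3, 4}, "krx": {"KS200", "KQ150"}}


def check_params(dataset_type, normalization, model_type, k, stock):
    """Validate the parameters and return the prediction horizon."""
    if dataset_type not in ALLOWED_STOCKS:
        raise ValueError(f"unknown dataset_type: {dataset_type!r}")
    if normalization not in ("Zscore", "MinMax", "DecPre"):
        raise ValueError(f"unknown normalization: {normalization!r}")
    if model_type not in ("deeplob", "lobster"):
        raise ValueError(f"unknown model_type: {model_type!r}")
    if not set(stock) <= ALLOWED_STOCKS[dataset_type]:
        raise ValueError(f"stock {stock!r} is not valid for {dataset_type!r}")
    if dataset_type == "fi2010":
        if k not in FI2010_HORIZONS:
            raise ValueError("for fi2010, k must be one of 0, 1, 2, 3, 4")
        return FI2010_HORIZONS[k]
    return k  # assumption: for krx, k is used directly as the horizon


def split_days(days, train_test_ratio):
    """Chronological split: earliest days for training, latest for testing."""
    cut = round(len(days) * train_test_ratio)  # assumed rounding rule
    return days[:cut], days[cut:]


# The base setting shown in the parameter block above.
horizon = check_params("fi2010", "Zscore", "lobster", k=4, stock=[0, 1, 2, 3, 4])
train_days, test_days = split_days(list(range(10)), 0.7)
```

With the base setting, k = 4 corresponds to a horizon of 100 ticks, and a ratio of 0.7 over ten days puts the first seven days in the training set.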
When you run main.py, it will automatically generate a unique ID for each experiment and print it. The ID includes some information about the experiment, such as the model type and the experiment datetime (e.g. lobster-lighten_2022-12-03_10:34:05). The trained model and all corresponding results are saved in loggers/results/[model id].
Our reference papers are listed in References.md.
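For readers who want to locate a run's output programmatically, the experiment ID format described above can be reproduced with a sketch like the following. The function names are hypothetical, and the form of the ID when lighten is False (the model type alone, without a suffix) is an assumption. Only the lobster-lighten_2022-12-03_10:34:05 example is taken from this document.

```python
import os
from datetime import datetime


def make_experiment_id(model_type, lighten, now=None):
    """Build an ID such as 'lobster-lighten_2022-12-03_10:34:05'."""
    now = now or datetime.now()
    # Assumption: the '-lighten' suffix is simply omitted when lighten is False.
    name = f"{model_type}-lighten" if lighten else model_type
    return f"{name}_{now.strftime('%Y-%m-%d_%H:%M:%S')}"


def result_dir(experiment_id):
    """Directory where the trained model and results are stored."""
    return os.path.join("loggers", "results", experiment_id)


example_id = make_experiment_id("lobster", True, datetime(2022, 12, 3, 10, 34, 5))
```

Note that the colons in the timestamp are not valid in Windows file names, so this ID format is best suited to Linux or macOS file systems.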