LightGBM - high performance gradient boosting - for Ruby
Add this line to your application’s Gemfile:
gem "lightgbm"
On Mac, also install OpenMP:
brew install libomp
Prep your data
x = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [1, 2, 3, 4]
Train a model
params = {objective: "regression"}
train_set = LightGBM::Dataset.new(x, label: y)
booster = LightGBM.train(params, train_set)
Predict
booster.predict(x)
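To check how close the predictions are to the labels, one option is to compute the root mean squared error yourself. The sketch below is plain Ruby and not part of the gem. The `rmse` helper is a hypothetical name, and `y_pred` is a stand-in for the values returned by `booster.predict(x)`, assuming that call returns one number per row.

```ruby
# Root mean squared error between true labels and predictions.
# Hypothetical helper in plain Ruby; the lightgbm gem does not define it.
def rmse(actual, predicted)
  raise ArgumentError, "length mismatch" unless actual.size == predicted.size
  raise ArgumentError, "empty input" if actual.empty?

  squared_errors = actual.zip(predicted).map { |a, pr| (a - pr)**2 }
  Math.sqrt(squared_errors.sum / actual.size.to_f)
end

y = [1, 2, 3, 4]
# Stand-in for the array returned by booster.predict(x)
y_pred = [1.5, 2.0, 2.5, 4.0]

puts rmse(y, y_pred)
```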
Save the model to a file
booster.save_model("model.txt")
Load the model from a file
booster = LightGBM::Booster.new(model_file: "model.txt")
Get the importance of features
booster.feature_importance
Early stopping
LightGBM.train(params, train_set, valid_sets: [train_set, test_set], early_stopping_rounds: 5)
Cross-validation
LightGBM.cv(params, train_set, nfold: 5, verbose_eval: true)
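For intuition, `nfold: 5` means the rows are divided into five folds, and each fold takes one turn as the validation set. The sketch below shows one way non-stratified folds could be formed in plain Ruby. It is an illustration of the idea only, not the gem's internal implementation, and `kfold_indices` is a hypothetical helper.

```ruby
# Split row indices 0...n into nfold folds after a seeded shuffle.
# Returns an array of [train_indices, valid_indices] pairs.
# Hypothetical helper for illustration; not how LightGBM.cv is implemented.
def kfold_indices(n, nfold, seed: 42)
  raise ArgumentError, "nfold must be between 2 and n" unless nfold.between?(2, n)

  shuffled = (0...n).to_a.shuffle(random: Random.new(seed))

  # Deal indices round-robin so fold sizes differ by at most one
  folds = Array.new(nfold) { [] }
  shuffled.each_with_index { |idx, i| folds[i % nfold] << idx }

  # Each fold is the validation set once; the remaining rows are for training
  folds.map { |valid| [shuffled - valid, valid] }
end

kfold_indices(10, 5).each_with_index do |(train, valid), i|
  puts "fold #{i}: train=#{train.sort.inspect} valid=#{valid.sort.inspect}"
end
```

Because the labels are ignored when forming folds, class proportions may differ from fold to fold.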
Prep your data
x = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [1, 2, 3, 4]
Train a model
model = LightGBM::Regressor.new
model.fit(x, y)
For classification, use LightGBM::Classifier
Predict
model.predict(x)
For classification, use predict_proba for probabilities
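If you need class labels from the probabilities, you can take the position of the largest value in each row. This is a minimal sketch, assuming each row returned by `predict_proba` is an array with one probability per class and that the class index matches the position. `predicted_classes` is a hypothetical helper and `probs` is illustrative stand-in data.

```ruby
# Pick the index of the highest probability in each row.
# Hypothetical helper in plain Ruby; not part of the lightgbm gem.
def predicted_classes(probabilities)
  probabilities.map do |row|
    row.each_with_index.max_by { |prob, _index| prob }.last
  end
end

# Stand-in for model.predict_proba(x); the values are illustrative only
probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

p predicted_classes(probs)
```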
Save the model to a file
model.save_model("model.txt")
Load the model from a file
model.load_model("model.txt")
Get the importance of features
model.feature_importances
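The importances are easier to read when paired with feature names and sorted. The sketch below assumes the importances come back as an array with one number per feature, in column order. The feature names, the numbers, and the `rank_features` helper are all hypothetical.

```ruby
# Pair each feature name with its importance and sort from most to least important.
# Hypothetical helper in plain Ruby; not part of the lightgbm gem.
def rank_features(names, importances)
  raise ArgumentError, "length mismatch" unless names.size == importances.size

  names.zip(importances).sort_by { |_name, score| -score }
end

feature_names = ["sqft", "bedrooms", "age"] # hypothetical column names
importances = [12, 30, 5]                   # stand-in for model.feature_importances

rank_features(feature_names, importances).each do |name, score|
  puts "#{name}: #{score}"
end
```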
Early stopping
model.fit(x, y, eval_set: [[x_test, y_test]], early_stopping_rounds: 5)
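Early stopping needs held-out data such as `x_test` and `y_test`. One way to produce them is a seeded shuffle split in plain Ruby, sketched below. `train_test_split` is a hypothetical helper, not something the gem provides, and the 25% test size is an arbitrary choice.

```ruby
# Shuffle rows with a fixed seed and hold out a fraction for testing.
# Returns [x_train, x_test, y_train, y_test].
# Hypothetical helper in plain Ruby; not part of the lightgbm gem.
def train_test_split(x, y, test_size: 0.25, seed: 42)
  raise ArgumentError, "x and y must have the same length" unless x.size == y.size

  indices = (0...x.size).to_a.shuffle(random: Random.new(seed))
  n_test = (x.size * test_size).round
  test_idx = indices.first(n_test)
  train_idx = indices.drop(n_test)

  [x.values_at(*train_idx), x.values_at(*test_idx),
   y.values_at(*train_idx), y.values_at(*test_idx)]
end

x = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [1, 2, 3, 4]
x_train, x_test, y_train, y_test = train_test_split(x, y)

# The pieces could then be used like this:
#   model.fit(x_train, y_train, eval_set: [[x_test, y_test]], early_stopping_rounds: 5)
puts "train rows: #{x_train.size}, test rows: #{x_test.size}"
```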
Data can be an array of arrays
[[1, 2, 3], [4, 5, 6]]
Or a Numo array
Numo::NArray.cast([[1, 2, 3], [4, 5, 6]])
Or a Rover data frame
Rover.read_csv("houses.csv")
Or a Daru data frame
Daru::DataFrame.from_csv("houses.csv")
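If your records start out as hashes, they can be converted to the array-of-arrays form with plain Ruby. This is a small sketch with made-up house records. `to_features_and_labels` and the column names are hypothetical.

```ruby
# Split an array of hashes into a feature matrix (array of arrays) and a label array.
# Hypothetical helper in plain Ruby; not part of the lightgbm gem.
def to_features_and_labels(records, feature_keys:, label_key:)
  x = records.map { |record| record.fetch_values(*feature_keys) }
  y = records.map { |record| record.fetch(label_key) }
  [x, y]
end

# Made-up records for illustration
houses = [
  {sqft: 1200, bedrooms: 2, price: 250_000},
  {sqft: 1800, bedrooms: 3, price: 340_000},
  {sqft: 2400, bedrooms: 4, price: 455_000}
]

x, y = to_features_and_labels(houses, feature_keys: [:sqft, :bedrooms], label_key: :price)
p x
p y
```

`fetch_values` and `fetch` raise a `KeyError` when a key is missing, so incomplete records fail loudly instead of passing `nil` into training.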
This library follows the Python API. A few differences are:
- The get_ and set_ prefixes are removed from methods
- The default verbosity is -1
- With the cv method, stratified is set to false
Thanks to the xgboost gem for showing how to use FFI.
View the changelog
Everyone is encouraged to help improve this project.
To get started with development:
git clone https://github.com/ankane/lightgbm-ruby.git
cd lightgbm-ruby
bundle install
bundle exec rake vendor:all
bundle exec rake test