Python Automated Machine Learning library for tabular data.
Simple but powerful Automated Machine Learning library for tabular data. It uses efficient in-memory SAP HANA algorithms to automate routine Data Science tasks.
Explore the docs · Report Bug · Request Feature
The project has been frozen indefinitely. However, you can still use our web app. Also, this library is an open-source research project and is not part of any official SAP product.
This is a simple but accurate Automated Machine Learning library. Built on SAP HANA's powerful in-memory algorithms, it provides high accuracy in multiple machine learning tasks. The library also uses numerous data preprocessing functions to automate routine data cleaning tasks. In short, hana_automl goes through all AutoML steps and makes Data Science work easier.
From www.sap.com: SAP HANA is a high-performance in-memory database that speeds data-driven, real-time decisions and actions.
Web app: https://share.streamlit.io/dan0nchik/sap-hana-automl/main/web.py
Documentation: https://sap-hana-automl.readthedocs.io/en/latest/index.html
OpenML comparison notebook: https://github.com/dan0nchik/SAP-HANA-AutoML/blob/main/comparison_openml.ipynb
By the end of summer 2021, the blue part will be fully automated by our library.
Streamlit client
To get the package up and running, follow these simple steps.
Make sure you have the following:
Set up SAP HANA (skip this step if you have an instance with PAL enabled).
There are 2 ways to do that.
In HANA Cloud:
In Virtual Machine:
Installed software:
python --version
returns > 3.6
pip3 install Cython
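The version prerequisite can also be checked from inside Python, for example at the top of a setup script. The following is a minimal sketch; the helper name `python_is_supported` is hypothetical, and treating 3.6 itself as acceptable is an assumption, since the requirement above only says "> 3.6".

```python
import sys

# Minimum version implied by the prerequisite above.
# Assumption: 3.6 itself counts as supported.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    # str.format is used instead of f-strings so that the check itself
    # still runs on interpreters that are too old.
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print("Python {} is new enough".format(current))
    else:
        print("Python {} is too old, need {}.{} or later".format(
            current, MIN_PYTHON[0], MIN_PYTHON[1]))
```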
There are 2 ways to install the library
pip3 install hana_automl
pip3 install https://github.com/dan0nchik/SAP-HANA-AutoML/archive/dev.zip
Note: the latest version may contain bugs, be careful!
Check that PAL (Predictive Analysis Library) is installed and roles are granted
from hana_automl.utils.scripts import setup_user
from hana_ml.dataframe import ConnectionContext
cc = ConnectionContext(address='address', user='user', password='password', port=39015)
# replace with credentials of user that will be created or granted a role to run PAL.
setup_user(connection_context=cc, username='user', password="password")
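The connection examples in this README hard-code credentials for brevity. One possible alternative is to read them from environment variables. This is only a sketch: the `HANA_*` variable names and both helper functions are hypothetical and not part of hana_automl or hana_ml; the only library call used is `ConnectionContext` with the same keyword arguments as in the example above.

```python
import os


def hana_connection_kwargs(env=None):
    """Build keyword arguments for ConnectionContext from environment variables.

    The HANA_* variable names are hypothetical; rename them to suit your setup.
    """
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "address": env["HANA_ADDRESS"],
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        # 39015 is the port from the example above; set HANA_PORT to override it.
        "port": int(env.get("HANA_PORT", "39015")),
    }


def connect(env=None):
    """Open a connection using the variables above (requires hana_ml)."""
    # Imported here so hana_connection_kwargs works without hana_ml installed.
    from hana_ml.dataframe import ConnectionContext

    return ConnectionContext(**hana_connection_kwargs(env))
```

With the variables exported in your shell, `cc = connect()` can then stand in for the hard-coded `ConnectionContext(...)` call.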
Our library in a few lines of code
Connect to database.
from hana_ml.dataframe import ConnectionContext
cc = ConnectionContext(address='address',
                       user='username',
                       password='password',
                       port=1234)
Create AutoML model and fit it.
from hana_automl.automl import AutoML
model = AutoML(cc)
model.fit(
    file_path='path to training dataset',  # it may be HANA table/view, or pandas DataFrame
    steps=10,  # number of iterations
    target='target',  # column to predict
    time_limit=120  # time limit in seconds
)
Predict.
model.predict(
    file_path='path to test dataset',
    id_column='ID',
    verbose=1
)
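The fit and predict calls above each take a path to a dataset. If you only have a single CSV file, one way to get the two files is to split it yourself first. The sketch below uses only the standard library; `split_csv`, its parameters, and the 80/20 default are illustrative assumptions, not part of hana_automl.

```python
import csv
import random


def split_csv(source_path, train_path, test_path, test_fraction=0.2, seed=42):
    """Shuffle the rows of a CSV file and write train and test CSV files.

    Returns (number of training rows, number of test rows).
    """
    with open(source_path, newline="") as source:
        reader = csv.reader(source)
        header = next(reader)
        rows = list(reader)

    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    test_size = int(len(rows) * test_fraction)
    test_rows, train_rows = rows[:test_size], rows[test_size:]

    # Both output files keep the original header row.
    for path, subset in ((train_path, train_rows), (test_path, test_rows)):
        with open(path, "w", newline="") as target:
            writer = csv.writer(target)
            writer.writerow(header)
            writer.writerows(subset)
    return len(train_rows), len(test_rows)
```

The resulting paths can then be passed as `file_path` to `model.fit` and `model.predict` respectively.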
For more examples, please refer to the Documentation
git clone https://github.com/dan0nchik/SAP-HANA-AutoML.git
pip3 install Cython
pip3 install -r requirements.txt
streamlit run ./web.py
See the open issues for a list of proposed features (and known issues). Feel free to report any bugs :)
Any contributions you make are greatly appreciated!
Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Install dependencies
pip3 install Cython
pip3 install -r requirements.txt
Create a credentials.py file in the tests directory
Your files should look like this:
SAP-HANA-AutoML
│   README.md
│   all other files
│   .....
│
└───tests
    │   test files...
    │   credentials.py
Copy and paste this piece of code there and replace the placeholder values with your credentials:
host = "host"
user = "username"
password = "password"
port = 39015 # or any port you need
schema = "your schema"
Don't worry, this file is in .gitignore, so your credentials won't be seen by anyone.
Make some changes
Write tests that cover your code in the tests directory
Run tests (under the SAP-HANA-AutoML directory)
pytest
Commit your changes (git commit -m 'Add some amazing features')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request
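Before running pytest, it can help to sanity-check the credentials.py file described in the steps above, since a wrong type (for example a port written as a string) may only surface later as a connection error. The helper below is a hypothetical sketch, not part of the project's test suite; the field names and types are taken from the credentials.py template above.

```python
# Field names and expected types, taken from the credentials.py template.
REQUIRED_FIELDS = {
    "host": str,
    "user": str,
    "password": str,
    "port": int,
    "schema": str,
}


def credential_problems(credentials):
    """Return a list of human-readable problems found in a credentials object.

    `credentials` can be the imported credentials module or any object
    exposing the same attributes.
    """
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        value = getattr(credentials, name, None)
        if value is None:
            problems.append(name + " is missing")
        elif not isinstance(value, expected_type):
            problems.append(name + " should be of type " + expected_type.__name__)
        elif value == "":
            problems.append(name + " is empty")
    return problems
```

From the tests directory, `import credentials` followed by `print(credential_problems(credentials))` should print an empty list when the file is filled in correctly.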
Distributed under the MIT License. See LICENSE for more information.
Don't really understand the license? Check out the MIT license summary.
Authors: @While-true-codeanything, @DbusAI, @dan0nchik
Project Link: https://github.com/dan0nchik/SAP-HANA-AutoML