Python module for data scientists for quickly creating annotation projects.
Trunklucator is a Python module for data scientists and ML practitioners for quickly creating annotation projects and testing ideas. It acts like Python's native input() function, but supports displaying rich content and advanced interaction with the user through a web browser. Trunklucator lets you easily plug human interaction into your model prototype.

    from trunklucator import WebUI

    # data: any iterable of items to annotate
    with WebUI() as webui:  # start the HTTP server in the background
        for item in data:
            y = webui.ask(item)  # wait for the user's action on the web page
            print(y)

For full examples, see the examples/quickstart directory.
Task | Screenshot | Example code |
---|---|---|
binary classification | ![]() ![]() | For images: examples/quickstart/binary_class_image.py; for text: examples/quickstart/binary_class_text.py |
multiclass classification | ![]() | examples/quickstart/multi_class_text.py |
multilabel classification | ![]() | examples/quickstart/multi_label_text.py |
Named Entity Recognition (NER) | ![]() | examples/quickstart/ner_text.py |
HTML page annotation | ![]() | examples/quickstart/ner_html.py |
Trunklucator works best when you need to present complex data such as images, formatted text, video, or sound to the user and ask them to label or annotate it. As soon as the user acts, you can use the result in your pipeline. Trunklucator works well together with active learning (see example https://github.com/Dumbris/pytorch_active_learning/active_learning_basics.py).
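
To make the active learning pairing concrete, here is a minimal sketch of one uncertainty-sampling round. It is not taken from the linked example: the helper names (`uncertainty`, `active_learning_round`), the toy scores, and the assumption that the answer maps to an integer label are all illustrative. The `ask` argument is a stand-in so the sketch runs without a browser; in a real project you would presumably pass `webui.ask`, whose return format depends on the frontend in use.

```python
import math
from typing import Callable, List, Tuple


def uncertainty(p: float) -> float:
    """Binary entropy of a predicted positive-class probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def active_learning_round(
    pool: List[str],
    predict_proba: Callable[[str], float],
    ask: Callable[[str], int],
    budget: int,
) -> List[Tuple[str, int]]:
    """Ask for labels on the `budget` most uncertain items in `pool`."""
    ranked = sorted(pool, key=lambda x: uncertainty(predict_proba(x)), reverse=True)
    return [(item, ask(item)) for item in ranked[:budget]]


# Hypothetical stand-ins: a fake model and a fake annotator.
toy_scores = {"great": 0.95, "meh": 0.5, "awful": 0.05, "fine?": 0.6}
labeled = active_learning_round(
    pool=list(toy_scores),
    predict_proba=toy_scores.__getitem__,
    ask=lambda item: 1 if item == "fine?" else 0,  # swap in webui.ask here
    budget=2,
)
print(labeled)
```

Only the two items closest to a 0.5 score reach the annotator, which is the point of the technique: human effort goes where the model is least sure.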

    pip install trunklucator

You can use environment variables to change the default parameters:

    PORT=8080 python3 main.py

You can also pass the same parameters in code when you instantiate the trunklucator.WebUI class:

    with WebUI(host='0.0.0.0', port=8080, data_dir='./data', frontend_dir='./myfront') as webui:
        ...  # call webui.ask(...) here

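
If you prefer to make the configuration explicit in your own script, the sketch below gathers settings into keyword arguments for WebUI. The helper `resolve_webui_kwargs` is hypothetical and not part of Trunklucator. It reads only the two variable names mentioned in this README (PORT and FRONTEND_DIR), and the rule that explicit overrides win over the environment is a design choice of this sketch, not a statement about how the library itself resolves conflicts.

```python
import os
from typing import Any, Dict, Mapping, Optional


def resolve_webui_kwargs(
    env: Optional[Mapping[str, str]] = None, **overrides: Any
) -> Dict[str, Any]:
    """Build WebUI keyword arguments from the environment plus overrides."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}
    if env.get("PORT"):
        kwargs["port"] = int(env["PORT"])  # environment values are strings
    if env.get("FRONTEND_DIR"):
        kwargs["frontend_dir"] = env["FRONTEND_DIR"]
    # In this sketch, explicit non-None overrides take precedence.
    kwargs.update({k: v for k, v in overrides.items() if v is not None})
    return kwargs


example = resolve_webui_kwargs(
    env={"PORT": "8080", "FRONTEND_DIR": "label_studio"},
    port=None,
    data_dir="./data",
)
print(example)
# Intended use (requires trunklucator to be installed):
#   with WebUI(**resolve_webui_kwargs(data_dir="./data")) as webui: ...
```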
Methods available on a WebUI instance:
- .ask(data, meta) (meta is optional) - calling this method blocks the execution of your code until the user acts in the web browser.
- .update(data) - asynchronously publishes information to the frontend.

Trunklucator consists of two parts: a Python module that runs a small HTTP server in a background thread, and a frontend, which can be any JavaScript single-page application that supports a simple protocol for fetching task data.
These parts interact with each other using HTTP or WebSocket. You don't need to change the Python part; it is a ready-to-use abstraction.
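
The blocking behaviour of .ask() with a server running in a background thread can be pictured with a toy model. The sketch below is not Trunklucator's implementation: `ToyUI` and `answer_fn` are hypothetical names, and a worker thread with two queues stands in for the HTTP server and the browser. It only illustrates the pattern of publishing a task and waiting for the answer.

```python
import queue
import threading


class ToyUI:
    """Toy stand-in for a blocking ask(); not Trunklucator's code."""

    def __init__(self, answer_fn):
        self._tasks = queue.Queue()
        self._answers = queue.Queue()
        self._answer_fn = answer_fn
        self._worker = threading.Thread(target=self._serve, daemon=True)

    def __enter__(self):
        self._worker.start()  # "server" starts in the background
        return self

    def __exit__(self, exc_type, exc, tb):
        self._tasks.put(None)  # sentinel: tell the worker to stop
        self._worker.join()
        return False

    def _serve(self):
        # Plays the role of server plus browser: take a task, answer it.
        while True:
            task = self._tasks.get()
            if task is None:
                break
            self._answers.put(self._answer_fn(task))

    def ask(self, data):
        """Publish a task, then block until an answer arrives."""
        self._tasks.put(data)
        return self._answers.get()


with ToyUI(answer_fn=lambda text: len(text) > 3) as ui:
    results = [ui.ask(item) for item in ["spam", "ham", "eggs"]]
print(results)
```

The calling code reads like a plain loop over input() calls, while the waiting happens inside ask(); that is the property the real module provides over HTTP or WebSocket.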
You can select which frontend to use by setting the frontend_dir parameter of WebUI or the FRONTEND_DIR environment variable.
You can set it to the path of your custom frontend directory, or use one of the predefined names for the frontends bundled with the Python package.
The current version has two integrated frontends:
- WebUI(frontend_dir='html_field') - designed to be hackable; you can adjust it for your specific data format. The default implementation can load arbitrary HTML text. UI controls can be configured in Python code.
- WebUI(frontend_dir='label_studio') - an advanced frontend with support for many data types. For more information, see the official site https://labelstud.io/ and examples/quickstart/ner_text.py

To customize the default frontend: