Everything you need to know about Active Learning (AL).
Papers • Introduction • Tutorials • Survey • Problem Settings • Theory
Dissertations • Code & Library • Scholars • Applications
Contributing
If you find any valuable research, please feel free to open a pull request or contact [email protected] to update this repository. Comments and suggestions are also very welcome!
By conference: ICML / NeurIPS / ICLR / AAAI / IJCAI / ACL / CVPR / ICCV
By journal: AI / TPAMI / IJCV / JMLR
By degree (dissertations): Master / PhD
Organized in a problem-oriented way, so that readers can easily locate and track the problem they are interested in.
Problem: High labeling cost is common in the machine learning community. Having to acquire a large number of annotations hinders the application of machine learning methods.
Essence / Assumption: Not all instances are equally important to the target task, so labeling only the more important instances can reduce the cost.
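The assumption above can be illustrated with a minimal sketch of a pool-based loop using uncertainty sampling. Everything in it is a toy assumption made for illustration and is not taken from any paper or library listed in this repository: the 1-D threshold classifier, the hypothetical `oracle` annotator, and the true threshold of 0.37. The idea is only that querying the instance the current model is least sure about lets a small labeling budget go a long way.

```python
import random


def fit_threshold(labeled):
    """Fit a 1-D threshold classifier: the midpoint between the largest
    labeled negative and the smallest labeled positive."""
    neg = [x for x, y in labeled if y == 0]
    pos = [x for x, y in labeled if y == 1]
    lo = max(neg) if neg else 0.0
    hi = min(pos) if pos else 1.0
    return (lo + hi) / 2.0


def most_uncertain(pool, threshold):
    """Uncertainty sampling: pick the pool point closest to the boundary."""
    return min(pool, key=lambda x: abs(x - threshold))


def oracle(x, true_threshold=0.37):
    """Hypothetical annotator that returns the ground-truth label."""
    return int(x >= true_threshold)


def active_learning(pool, budget):
    """Query `budget` labels, refitting the model after each answer."""
    pool = list(pool)
    labeled = []
    for _ in range(budget):
        threshold = fit_threshold(labeled)
        x = most_uncertain(pool, threshold)
        pool.remove(x)                      # the instance leaves the pool
        labeled.append((x, oracle(x)))      # and joins the labeled set
    return fit_threshold(labeled), labeled


rng = random.Random(0)
pool = [rng.random() for _ in range(1000)]  # 1000 unlabeled points in [0, 1)
threshold, labeled = active_learning(pool, budget=15)
print(f"labels used: {len(labeled)}, learned threshold: {threshold:.3f}")
```

In this toy setting the loop behaves like a binary search over the pool, so 15 labels out of 1000 candidates are enough to place the boundary close to the true one. Real tasks are rarely this clean, which is what much of the literature collected here addresses.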
When we talk about active learning, we talk about:
There have been several reviews / surveys / benchmarks for this topic.
Lecture Topic | Year | Lecturer | Occasion |
---|---|---|---|
Active learning and transfer learning at scale with R and Python | 2018 | - | KDD |
Active Learning from Theory to Practice | 2019 | Robert Nowak & Steve Hanneke | ICML |
Overview of Active Learning for Deep Learning | 2021 | Jacob Gildenblat | Personal Blog |
Almost all AL studies are based on the following scenarios. They differ in where the queried samples come from. The details of these scenarios can be found here.
Three scenarios and corresponding tasks:
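As a rough illustration, and assuming the three scenarios are the ones commonly described in the AL literature (pool-based sampling, stream-based selective sampling, and membership query synthesis), the difference in where the queried samples come from can be sketched as follows. The `uncertainty` score, the decision boundary at 0.5, and all function names are hypothetical and chosen only for this example.

```python
import random


def uncertainty(x, boundary=0.5):
    """Toy informativeness score: higher when x is closer to the boundary."""
    return 1.0 - abs(x - boundary)


def pool_based(pool, k):
    """Pool-based sampling: rank a fixed unlabeled pool and query the top k."""
    return sorted(pool, key=uncertainty, reverse=True)[:k]


def stream_based(stream, threshold):
    """Stream-based selective sampling: samples arrive one at a time and
    each one is either queried or discarded on the spot."""
    return [x for x in stream if uncertainty(x) >= threshold]


def query_synthesis(k, boundary=0.5, width=0.05, rng=None):
    """Membership query synthesis: generate new samples to query
    instead of choosing among existing ones."""
    rng = rng or random.Random(0)
    return [boundary + rng.uniform(-width, width) for _ in range(k)]


rng = random.Random(0)
data = [rng.random() for _ in range(200)]

pool_queries = pool_based(data, k=5)                        # from a fixed pool
stream_queries = stream_based(iter(data), threshold=0.95)   # from a stream
synth_queries = query_synthesis(k=5, rng=rng)               # generated
```

The selection criterion is the same in all three functions; only the source of candidates changes, which is why many query strategies can be adapted from one scenario to another.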
There are many variants of machine learning problem settings with more advanced tasks. Under these problem settings, AL could be further applied.
Related AL Fields:
AL can reduce the cost of annotation in many other AI research fields where the tasks go beyond simple classification or regression. These tasks either require different types of outputs or assume an unusual learning process, so AL algorithms need to be revised or developed for these problem settings.
Utilize AL in the following fields (hot topics):
There is a substantial body of theoretical work supporting AL. Most of it focuses on establishing performance guarantees or identifying the weaknesses of AL selection.
(This section has not been finished yet.)
Many AL studies are built on highly idealized experimental settings. When AL is applied to real-life scenarios, the practical situation usually does not perfectly match the assumptions made in the experiments. These mismatches lead to issues that hinder the application of AL. In this section, the practical considerations are reviewed under different assumptions.
The considerations of: data / oracle / scale / workflow / model training cost / query & feedback types / performance metric / reliability / privacy / others
The details and the full list can be found here.
AL has already been used in many real-world applications. For various reasons, the implementations at many companies are confidential, but many applications can still be found in published papers and on websites.
Basically, there are two types of applications: scientific applications & industrial applications.
Name | Languages | Author | Notes |
---|---|---|---|
AL playground | Python(scikit-learn, keras) | - | Abandoned |
modAL | Python(scikit-learn) | Tivadar Danka | Actively updated |
libact | Python(scikit-learn) | NTU(Hsuan-Tien Lin group) | |
ALiPy | Python(scikit-learn) | NUAA(Shengjun Huang) | Includes MLAL |
pytorch_active_learning | Python(pytorch) | Robert Monarch | Actively updated; includes active transfer learning |
DeepAL | Python(scikit-learn, pytorch) | Kuan-Hao Huang | Actively updated; deep neural networks |
BaaL | Python(scikit-learn, pytorch) | ElementAI | Actively updated; Bayesian active learning |
lrtc | Python(scikit-learn, tensorflow) | IBM | Text classification |
Small-text | Python(scikit-learn, pytorch) | Christopher Schröder | Text classification |
DeepCore | Python(scikit-learn, pytorch) | Guo et al. | Formulated as coreset selection |
PyRelationAL: A Library for Active Learning Research and Development | Python(scikit-learn, pytorch) | Scherer et al. | |
DeepAL+ | Python(scikit-learn, pytorch) | Zhan | An extension of DeepAL |
ALaaS | Python(scikit-learn) | A*STAR & NTU | Uses stage-level parallelism for AL |
We also list several scholars who are currently contributing heavily to this research direction.
Several young researchers who provide valuable insights for AL: