Skweak Versions Save

skweak: A software toolkit for weak supervision applied to NLP tasks

0.3.1

2 years ago

Brand new version of skweak, including both a number of bug fixes and some new functionalities:

  1. skweak is now using the latest version of hmmlearn, thereby fixing a number errors due to a mismatch between method names
  2. We now have a clearer split between aggregation models for sequence labelling and for text classification. Possible aggregators for sequence labelling are SequentialMajorityVoter and HMM (preferred), while the aggregators for non-sequential text classification are MajorityVoter and NaiveBayes.

We also introduce a brand new functionality: multi-label classification! Instead of assuming that all labels are mutually exclusive, you can now aggregate the results of labelling functions without assuming that only one label is correct. This multi-label scheme is available for both sequence labelling (see MultilabelSequentialMajorityVoter and MultilabelHMM) and text classification (see MultilabelMajorityVoter and MultilabelNaiveBayes).

By default, all labels can be simultaneously true for a given data point, but you can enforce exclusivity relations between labels through the method set_exclusive_labels. If all labels are set to be mutually exclusive, the aggregation is equivalent to a standard multi-class setup. Internally, this functionality is implemented by constructing and fitting separate aggregation models for each label.

The code for the aggregation models has also been heavily refactored, making it hopefully easier to create new aggregation models.

0.2.8

3 years ago

First official release of skweak, with support for both sequence labelling and text classification! See the documentation for details.