libact.query_strategies.multilabel package

Submodules

libact.query_strategies.multilabel.adaptive_active_learning module

Adaptive active learning

class libact.query_strategies.multilabel.adaptive_active_learning.AdaptiveActiveLearning(dataset, base_clf, betas=None, n_jobs=1, random_state=None)

Bases: libact.base.interfaces.QueryStrategy

Adaptive Active Learning

This approach combines Max Margin Uncertainty Sampling and Label Cardinality Inconsistency.
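The betas parameter described below controls how the two measures are traded off. The following is a minimal sketch of one such trade-off, assuming the weighted-product form score = uncertainty^beta * inconsistency^(1 - beta) from the cited paper. The function names and the toy scores are illustrative assumptions, not libact's internals, and the way the strategy chooses among several beta values is not shown.

```python
def combined_score(uncertainty, inconsistency, beta):
    """Weighted product of two non-negative scores, with 0 <= beta <= 1.

    Assumption: beta = 1 uses only the uncertainty score and beta = 0 uses
    only the label cardinality inconsistency score.
    """
    return (uncertainty ** beta) * (inconsistency ** (1.0 - beta))


def best_candidate(candidates, beta):
    """Return the key of the candidate with the highest combined score.

    `candidates` maps a sample id to an (uncertainty, inconsistency) pair.
    """
    return max(candidates, key=lambda k: combined_score(*candidates[k], beta))


# Toy scores for two unlabeled samples (made-up numbers).
toy = {"a": (0.9, 0.1), "b": (0.2, 0.8)}
pick_uncertainty_only = best_candidate(toy, beta=1.0)
pick_inconsistency_only = best_candidate(toy, beta=0.0)
```

With these toy scores, the choice flips between the two samples as beta moves from 1 to 0, which is why a list of candidate beta values is useful.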

Parameters:

base_clf (ContinuousModel object instance) – The base learner for binary relevance.
betas (list of float, 0 <= beta <= 1, default: [0., 0.1, .., 0.9, 1.]) – List of trade-off parameters, each balancing the relative importance of the two measures.
random_state ({int, np.random.RandomState instance, None}, optional (default=None)) – If int or None, random_state is used as the seed to create an np.random.RandomState instance. If an np.random.RandomState instance, it is used directly as the random number generator.
n_jobs (int, optional, default: 1) – The number of jobs to use for the computation. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
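The n_jobs rule above is simple arithmetic. As a hedged illustration, a hypothetical helper (resolve_n_jobs is not part of libact) could map the setting to a worker count like this. Rejecting n_jobs = 0 and never returning fewer than one worker are assumptions of this sketch.

```python
import os


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Hypothetical helper: translate an n_jobs setting into a worker count."""
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    if n_jobs == 0:
        # Assumption: zero workers is treated as an invalid setting.
        raise ValueError("n_jobs must not be 0")
    if n_jobs < 0:
        # -1 -> all CPUs, -2 -> all but one, and so on; never fewer than one.
        return max(1, n_cpus + 1 + n_jobs)
    return n_jobs
```

On an 8-CPU machine, n_jobs=-1 resolves to 8 workers and n_jobs=-2 to 7.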

Examples

Here is an example of declaring an AdaptiveActiveLearning query_strategy object:

from libact.query_strategies.multilabel import AdaptiveActiveLearning
from libact.models import LogisticRegression  # libact model wrapper (a ContinuousModel)

qs = AdaptiveActiveLearning(
    dataset,  # Dataset object
    base_clf=LogisticRegression()
)

References

[1]	Li, Xin, and Yuhong Guo. “Active Learning with Multi-Label SVM Classification.” IJCAI. 2013.

make_query()

Return the index of the sample to be queried and labeled. Read-only.

No modification to the internal states.

Returns:	ask_id – The index of the next unlabeled sample to be queried and labeled.
Return type:	int
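To illustrate the make_query contract (it only reads state and returns an index; labeling and updating happen elsewhere), here is a self-contained toy loop. ToyPool, FirstUnlabeled, and run_queries are hypothetical stand-ins written for this sketch, not libact classes.

```python
class ToyPool:
    """Hypothetical stand-in for a dataset: features plus optional labels."""

    def __init__(self, features):
        self.features = list(features)
        self.labels = [None] * len(self.features)

    def unlabeled_ids(self):
        return [i for i, y in enumerate(self.labels) if y is None]

    def update(self, entry_id, label):
        self.labels[entry_id] = label


class FirstUnlabeled:
    """Hypothetical strategy: make_query reads the pool and never writes."""

    def __init__(self, pool):
        self.pool = pool

    def make_query(self):
        return self.pool.unlabeled_ids()[0]


def run_queries(pool, strategy, oracle, budget):
    """Ask for `budget` labels: query an index, label it, update the pool."""
    asked = []
    for _ in range(budget):
        ask_id = strategy.make_query()  # index of an unlabeled sample
        pool.update(ask_id, oracle(pool.features[ask_id]))
        asked.append(ask_id)
    return asked
```

Because make_query does not change any state, calling it twice without an intervening update returns the same index in this toy setup.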

libact.query_strategies.multilabel.binary_minimization module

Binary Minimization

class libact.query_strategies.multilabel.binary_minimization.BinaryMinimization(dataset, base_clf, random_state=None)

Bases: libact.base.interfaces.QueryStrategy

Binary Version Space Minimization (BinMin)

Parameters:

base_clf (ContinuousModel object instance) – The base learner for binary relevance.
random_state ({int, np.random.RandomState instance, None}, optional (default=None)) – If int or None, random_state is used as the seed to create an np.random.RandomState instance. If an np.random.RandomState instance, it is used directly as the random number generator.
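The random_state convention used throughout this package can be sketched as follows. The helper name as_random_state is an assumption made for illustration and is not claimed to be libact's own utility.

```python
import numpy as np


def as_random_state(random_state=None):
    """Return an np.random.RandomState built from, or equal to, random_state."""
    if isinstance(random_state, np.random.RandomState):
        # An existing generator is used directly.
        return random_state
    # An int seeds the generator; None lets NumPy pick a seed itself.
    return np.random.RandomState(random_state)
```

Passing the same integer seed twice yields two generators that produce identical draws, which makes query selection reproducible when ties are broken randomly.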

Examples

Here is an example of declaring a BinaryMinimization query_strategy object:

from libact.query_strategies.multilabel import BinaryMinimization
from libact.models import LogisticRegression  # libact model wrapper (a ContinuousModel)

qs = BinaryMinimization(
    dataset,  # Dataset object
    base_clf=LogisticRegression()  # keyword matches the class signature
)

References

[1]	Brinker, Klaus. “On active learning in multi-label classification.” From Data and Information Analysis to Knowledge Engineering. Springer Berlin Heidelberg, 2006. 206-213.

make_query()

Return the index of the sample to be queried and labeled. Read-only.

No modification to the internal states.

Returns:	ask_id – The index of the next unlabeled sample to be queried and labeled.
Return type:	int

libact.query_strategies.multilabel.maximum_margin_reduction module

Maximum loss reduction with Maximal Confidence (MMC)

class libact.query_strategies.multilabel.maximum_margin_reduction.MaximumLossReductionMaximalConfidence(*args, **kwargs)

Bases: libact.base.interfaces.QueryStrategy

Maximum loss reduction with Maximal Confidence (MMC)

This algorithm is designed to use binary relevance with an SVM as the base model.

Parameters:

base_learner (libact.query_strategies object instance) – The base learner for binary relevance. Should support predict_proba.
br_base (ProbabilisticModel object instance) – The base learner for the binary relevance in MMC. Should support predict_proba.
logreg_param (dict, optional (default={})) – Parameters for the logistic regression model that predicts the number of labels for a given feature vector. For details on the available parameters, see: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
random_state ({int, np.random.RandomState instance, None}, optional (default=None)) – If int or None, random_state is used as the seed to create an np.random.RandomState instance. If an np.random.RandomState instance, it is used directly as the random number generator.

logistic_regression_: libact.models.LogisticRegression object instance – The model used to predict the number of labels for each instance. Should support multi-class classification.

Examples

Here is an example of declaring an MMC query_strategy object:

from libact.query_strategies.multilabel import MMC
from libact.models import LogisticRegression  # libact model wrapper (a ProbabilisticModel)

qs = MMC(
    dataset,  # Dataset object
    br_base=LogisticRegression()
)

References

[1]	Yang, Bishan, et al. “Effective multi-label active learning for text classification.” Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2009.

make_query()

Return the index of the sample to be queried and labeled. Read-only.

No modification to the internal states.

Returns:	ask_id – The index of the next unlabeled sample to be queried and labeled.
Return type:	int

libact.query_strategies.multilabel.multilabel_with_auxiliary_learner module

Multi-label Active Learning with Auxiliary Learner

class libact.query_strategies.multilabel.multilabel_with_auxiliary_learner.MultilabelWithAuxiliaryLearner(dataset, major_learner, auxiliary_learner, criterion='hlr', b=1.0, random_state=None)

Bases: libact.base.interfaces.QueryStrategy

Multi-label Active Learning with Auxiliary Learner

Parameters:

major_learner (libact.base.interfaces.Model object instance) – The major multilabel learner. This should be the model that will ultimately be used to solve the problem.
auxiliary_learner (libact.models.multilabel object instance) – The auxiliary multilabel learner. For the 'shlr' and 'mmr' criteria, it must support predict_real or predict_proba.
criterion (['hlr', 'shlr', 'mmr'], optional (default='hlr')) – The criterion for estimating the difference between major_learner and auxiliary_learner. 'hlr': Hamming loss reduction; 'shlr': soft Hamming loss reduction; 'mmr': maximum margin reduction.
b (float) – Parameter for criterion 'shlr'. Scores are clipped to the range [-b, b] to remove the influence of extreme margin values.
random_state ({int, np.random.RandomState instance, None}, optional (default=None)) – If int or None, random_state is used as the seed to create an np.random.RandomState instance. If an np.random.RandomState instance, it is used directly as the random number generator.
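Two of the ideas in the parameter list can be shown in a few lines: the clipping controlled by b, and a Hamming-style disagreement between the two learners' predictions. This is a rough sketch under the assumption that hard predictions are 0/1 label vectors and soft predictions are real-valued margins. The function names are illustrative and the exact scoring formulas used by the class are not reproduced here.

```python
def clip_margin(margin, b=1.0):
    """Clip a real-valued margin to the interval [-b, b]."""
    return max(-b, min(b, margin))


def hamming_disagreement(major_pred, aux_pred):
    """Fraction of labels on which two equally long 0/1 label vectors differ."""
    differing = sum(1 for m, a in zip(major_pred, aux_pred) if m != a)
    return differing / len(major_pred)


# An extreme margin is pulled back to the boundary; a moderate one is untouched.
clipped_extreme = clip_margin(3.5, b=1.0)
clipped_moderate = clip_margin(0.25, b=1.0)

# The two learners disagree on two of four labels for this sample.
disagreement = hamming_disagreement([1, 0, 1, 1], [1, 1, 0, 1])
```

A sample on which the major and auxiliary learners disagree more is, under this reading, a more informative candidate to query.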

Examples

Here is an example of declaring a MultilabelWithAuxiliaryLearner query_strategy object:

from libact.query_strategies.multilabel import MultilabelWithAuxiliaryLearner
from libact.models.multilabel import BinaryRelevance
from libact.models import LogisticRegression, SVM

qs = MultilabelWithAuxiliaryLearner(
    dataset,  # Dataset object
    major_learner=BinaryRelevance(LogisticRegression()),
    auxiliary_learner=BinaryRelevance(SVM())
)

References

[1]	Hung, Chen-Wei, and Hsuan-Tien Lin. “Multi-label Active Learning with Auxiliary Learner.” ACML. 2011.

make_query()

Return the index of the sample to be queried and labeled. Read-only.

No modification to the internal states.

Returns:	ask_id – The index of the next unlabeled sample to be queried and labeled.
Return type:	int

Module contents

Concrete query strategy classes.