libact.base package
Submodules
libact.base.dataset module
The dataset class used in this package. A dataset consists of the data used for training, represented by a list of (feature, label) tuples. It may be exported in different formats for use with other libraries.
class libact.base.dataset.Dataset(X=None, y=None)
Bases: object
libact dataset object
Parameters: - X ({array-like}, shape = (n_samples, n_features)) – Features of the sample set.
- y (list of {int, None}, shape = (n_samples)) – The ground truth (label) for the corresponding sample. Unlabeled data should be given the label None.
data
list, shape = (n_samples) – List of all (feature, label) tuples.
append(feature, label=None)
Add a (feature, label) entry to the dataset. A None label indicates an unlabeled entry.
Parameters: - feature ({array-like}, shape = (n_features)) – Feature of the sample to append to the dataset.
- label ({int, None}) – Label of the sample to append to the dataset. None if unlabeled.
Returns: entry_id – entry_id for the appended sample.
Return type: {int}
format_sklearn()
Returns the dataset in (X, y) format for use in scikit-learn. Unlabeled entries are ignored.
Returns: - X (numpy array, shape = (n_samples, n_features)) – Sample feature set.
- y (numpy array, shape = (n_samples)) – Sample labels.
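As a rough illustration of the conversion described above, the sketch below splits a list of (feature, label) tuples into parallel X and y sequences and skips entries labeled None. It is not libact's implementation: `to_xy` and the sample `entries` are hypothetical names, and plain lists stand in for the numpy arrays the method is documented to return.

```python
# Hypothetical sample data: (feature, label) tuples, None marks an unlabeled entry.
entries = [
    ([0.0, 1.0], 0),
    ([1.0, 0.5], None),
    ([2.0, 2.0], 1),
]


def to_xy(entries):
    """Return (X, y) built from the labeled entries only."""
    labeled = [(feature, label) for feature, label in entries if label is not None]
    X = [feature for feature, _ in labeled]
    y = [label for _, label in labeled]
    return X, y


X, y = to_xy(entries)
```

X and y stay aligned by position, so they can be passed together to an estimator's fit method.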
get_entries()
Returns the list of all sample (feature, ground truth label) tuples.
Returns: data – List of all (feature, label) tuples.
Return type: list, shape = (n_samples)
get_labeled_entries()
Returns the list of labeled features and their labels.
Returns: labeled_entries – Labeled entries.
Return type: list of (feature, label) tuple
get_unlabeled_entries()
Returns the list of unlabeled features, along with their entry_ids.
Returns: unlabeled_entries – Unlabeled entries.
Return type: list of (entry_id, feature) tuple
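The two return shapes above differ: labeled entries come back as (feature, label) and unlabeled ones as (entry_id, feature). A minimal sketch of that split follows. It assumes an entry_id is the entry's position in the data list, which this page does not state, and `split_entries` is a hypothetical helper rather than part of libact.

```python
def split_entries(entries):
    """Separate entries into labeled (feature, label) and unlabeled (entry_id, feature)."""
    labeled = []
    unlabeled = []
    for entry_id, (feature, label) in enumerate(entries):
        if label is None:
            # Keep the id so a label can be attached to this entry later.
            unlabeled.append((entry_id, feature))
        else:
            labeled.append((feature, label))
    return labeled, unlabeled


entries = [([0.0, 1.0], 0), ([1.0, 0.5], None), ([2.0, 2.0], 1), ([3.0, 0.0], None)]
labeled, unlabeled = split_entries(entries)
```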
labeled_uniform_sample(sample_size, replace=True)
Returns a Dataset object containing labeled data only, resampled uniformly to the given sample size. The replace parameter decides whether sampling is done with replacement.
Parameters: sample_size –
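The difference between the two sampling modes can be sketched with the standard library. This is an illustration only, not libact's code: `uniform_sample` and its `seed` argument are hypothetical, and it returns a plain list instead of a Dataset.

```python
import random


def uniform_sample(labeled, sample_size, replace=True, seed=None):
    """Draw sample_size labeled entries uniformly at random."""
    rng = random.Random(seed)
    if replace:
        # With replacement: an entry may be drawn more than once,
        # so sample_size may exceed len(labeled).
        return rng.choices(labeled, k=sample_size)
    # Without replacement: every drawn entry is distinct,
    # so sample_size must not exceed len(labeled).
    return rng.sample(labeled, sample_size)


labeled = [([0.0], 0), ([1.0], 1), ([2.0], 0)]
with_replacement = uniform_sample(labeled, 5, replace=True, seed=0)
without_replacement = uniform_sample(labeled, 3, replace=False, seed=0)
```

In this sketch, requesting more entries than exist without replacement raises ValueError, which is the behavior of `random.sample`.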
on_update(callback)
Add a callback function to be called when the dataset is updated.
Parameters: callback (callable) – The function to be called when the dataset is updated.
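The callback registration above follows the usual observer pattern, sketched below with a stand-in class. Everything in it is an assumption for illustration: `ObservableEntries`, its `update` method, and the `(entry_id, label)` arguments passed to the callback are hypothetical, since this page does not document what triggers an update or what the callback receives.

```python
class ObservableEntries:
    """Hypothetical stand-in for a dataset that notifies listeners on change."""

    def __init__(self):
        self.data = []
        self._callbacks = []

    def on_update(self, callback):
        # Remember the callable so it can be invoked on later updates.
        self._callbacks.append(callback)

    def update(self, entry_id, label):
        # Assumed update operation: attach a label, then notify every listener.
        feature, _ = self.data[entry_id]
        self.data[entry_id] = (feature, label)
        for callback in self._callbacks:
            callback(entry_id, label)


seen = []
dataset = ObservableEntries()
dataset.data.append(([1.0, 2.0], None))
dataset.on_update(lambda entry_id, label: seen.append((entry_id, label)))
dataset.update(0, 1)
```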
libact.base.dataset.import_libsvm_sparse(filename)
Imports a dataset file in libsvm sparse format.
libact.base.dataset.import_scipy_mat(filename)
libact.base.interfaces module
Base interfaces for use in the package. The package works according to the interfaces defined below.
class libact.base.interfaces.ContinuousModel
Bases: libact.base.interfaces.Model
Classification Model with intermediate continuous output
A continuous classification model is able to output a real-valued vector for each feature vector provided.
predict_real(feature, *args, **kwargs)
Predict confidence scores for samples.
Returns the confidence score for each (sample, class) combination.
The larger the value for entry (sample=x, class=k), the more confident the model is that sample x belongs to class k.
Taking logistic regression as an example, the return value is the signed distance of the sample to the hyperplane.
Parameters: feature (array-like, shape (n_samples, n_features)) – The samples whose confidence scores are to be predicted.
Returns: X – Each entry is the confidence score for a (sample, class) combination.
Return type: array-like, shape (n_samples, n_classes)
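To make the signed-distance remark concrete, here is a minimal sketch for a two-class linear model. The weights `w`, bias `b`, and the function `signed_distance_scores` are hypothetical, and the layout (column 0 for the negative side, column 1 for the positive side) is an assumption chosen to match the documented (n_samples, n_classes) shape.

```python
import math


def signed_distance_scores(features, w, b):
    """Return one [score_class0, score_class1] row per sample."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    scores = []
    for x in features:
        # Signed distance from x to the hyperplane w.x + b = 0.
        distance = (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        # A larger entry means more confidence in that class.
        scores.append([-distance, distance])
    return scores


scores = signed_distance_scores([[3.0, 4.0], [-3.0, -4.0]], w=[3.0, 4.0], b=0.0)
```

The first sample lies 5 units on the positive side of the hyperplane and the second 5 units on the negative side, so each row favors a different class.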
class libact.base.interfaces.Labeler
Bases: object
Label the queries made by QueryStrategies
Assign labels to the samples queried by QueryStrategies.
class libact.base.interfaces.Model
Bases: object
Classification Model
A Model returns a class-predicting function for future samples after being trained on a training dataset.
predict(feature, *args, **kwargs)
Predict the class labels for the input samples.
Parameters: feature (array-like, shape (n_samples, n_features)) – The unlabeled samples whose labels are to be predicted.
Returns: y_pred – The class labels for samples in the feature array.
Return type: array-like, shape (n_samples,)
class libact.base.interfaces.MultilabelModel
Bases: libact.base.interfaces.Model
Multilabel Classification Model
A Model returns a multilabel-predicting function for future samples after being trained on a training dataset.
class libact.base.interfaces.ProbabilisticModel
Bases: libact.base.interfaces.ContinuousModel
Classification Model with probability output
A probabilistic classification model is able to output a real-valued vector for each feature vector provided.
predict_proba(feature, *args, **kwargs)
Predict probability estimates for samples.
Parameters: feature (array-like, shape (n_samples, n_features)) – The samples whose probability estimates are to be predicted.
Returns: X – Each entry is the probability estimate for each class.
Return type: array-like, shape (n_samples, n_classes)
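One common way to turn real-valued confidence scores into per-class probability estimates is a softmax over each row. The sketch below shows that idea only. It is an assumption for illustration, `softmax_rows` is a hypothetical helper, and a concrete model may compute its probabilities differently.

```python
import math


def softmax_rows(scores):
    """Map each row of scores to non-negative values that sum to 1."""
    probabilities = []
    for row in scores:
        # Subtract the row maximum first for numerical stability.
        top = max(row)
        exps = [math.exp(score - top) for score in row]
        total = sum(exps)
        probabilities.append([value / total for value in exps])
    return probabilities


probabilities = softmax_rows([[-5.0, 5.0], [0.0, 0.0]])
```

The ordering of classes within a row is preserved, so the class with the largest score also receives the largest probability.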
predict_real(feature, *args, **kwargs)
Predict confidence scores for samples.
Returns the confidence score for each (sample, class) combination.
The larger the value for entry (sample=x, class=k), the more confident the model is that sample x belongs to class k.
Taking logistic regression as an example, the return value is the signed distance of the sample to the hyperplane.
Parameters: feature (array-like, shape (n_samples, n_features)) – The samples whose confidence scores are to be predicted.
Returns: X – Each entry is the confidence score for a (sample, class) combination.
Return type: array-like, shape (n_samples, n_classes)
class libact.base.interfaces.QueryStrategy(dataset, **kwargs)
Bases: object
Pool-based query strategy
A QueryStrategy advises which unlabeled data to query next, given a pool of labeled and unlabeled data.
dataset
The Dataset object that is associated with this QueryStrategy.
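As an example of what a pool-based strategy might decide, the sketch below picks the unlabeled entry whose most likely class has the lowest probability (least-confident sampling). It is a standalone illustration, not a libact class: `least_confident`, `fake_predict_proba`, and the sample pool are hypothetical, and the pool uses the (entry_id, feature) layout described for unlabeled entries.

```python
def least_confident(unlabeled, predict_proba):
    """Return the entry_id of the unlabeled entry the model is least sure about."""
    entry_ids = [entry_id for entry_id, _ in unlabeled]
    features = [feature for _, feature in unlabeled]
    probabilities = predict_proba(features)
    # The lowest top-class probability marks the most uncertain sample.
    position = min(range(len(entry_ids)), key=lambda k: max(probabilities[k]))
    return entry_ids[position]


def fake_predict_proba(features):
    # Hypothetical fixed outputs standing in for a trained probabilistic model.
    return [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]][: len(features)]


pool = [(1, [0.0, 1.0]), (4, [1.0, 1.0]), (7, [2.0, 0.0])]
query_id = least_confident(pool, fake_predict_proba)
```

The returned entry_id would then be handed to a labeler, and the new label written back to the dataset.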