alpbench.pipeline.QueryStrategy¶

Classes

ActiveMLEnsembleQueryStrategy(seed, qs, ...)

This class is an abstract class for active learning query strategies that are ensemble-based.

ActiveMLModelBasedQueryStrategy(seed, qs)

This class is an abstract class for active learning query strategies that are model-based.

ActiveMLQueryStrategy(seed, qs)

This class is an abstract class for active learning query strategies.

BALDQueryStrategy(seed, ensemble_size)

This class is used to sample instances from the pool of unlabeled instances based on the BALD method.

BatchBaldQueryStrategy(seed, ensemble_size)

This class is used to sample instances from the pool of unlabeled instances based on the BatchBALD method.

ClusterMarginQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the cluster margin method.

CoreSetQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the core set method.

DiscriminativeQueryStrategy(seed)

EmbeddingBasedQueryStrategy(seed)

This class is an abstract class for query strategies that are based on embeddings.

EnsemblePseudoRandomizedQueryStrategy(seed, ...)

EntropyQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on entropy.

EpistemicUncertaintyQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on epistemic uncertainty.

ExpectedAveragePrecision(seed)

This class is used to sample instances from the pool of unlabeled instances based on the expected average precision.

FalcunQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the FALCUN method.

KMeansQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the KMeans method.

LeastConfidentQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on least confidence.

MarginQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the margin method.

MaxEntropyQueryStrategy(seed, ensemble_size)

This class is used to sample instances from the pool of unlabeled instances based on the maximum entropy method.

MinMarginQueryStrategy(seed, ensemble_size)

This class is used to sample instances from the pool of unlabeled instances based on the minimum margin method.

MonteCarloEERLogLoss(seed)

This class is used to sample instances from the pool of unlabeled instances based on the Monte Carlo expected error reduction (EER) method with log loss.

MonteCarloEERMisclassification(seed)

This class is used to sample instances from the pool of unlabeled instances based on the Monte Carlo expected error reduction (EER) method with misclassification loss.

MonteCarloEERStrategy(seed, method)

This class is used to sample instances from the pool of unlabeled instances based on the Monte Carlo EER method.

PowerBALDQueryStrategy(seed, ensemble_size)

This class is used to sample instances from the pool of unlabeled instances based on a power version of BALD.

PowerMarginQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the power margin method.

PseudoRandomizedQueryStrategy(seed)

This class is an abstract class for query strategies that are pseudo-randomized, meaning that their selections can be reproduced by reusing the same random seed.

QBCVarianceRatioQueryStrategy(seed, ...)

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee (QBC) method with variance ratio as the measure of ensemble disagreement.

QueryByCommitteeEntropyQueryStrategy(seed, ...)

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee method with entropy as the measure of ensemble disagreement.

QueryByCommitteeKLQueryStrategy(seed, ...)

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee method with KL divergence as the measure of ensemble disagreement.

QueryByCommitteeQueryStrategy(seed, method, ...)

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee method.

QueryStrategy()

This class is an abstract class for query strategies.

RandomMarginQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the random margin method.

RandomQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances randomly.

TypicalClusterQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the typicality of the instances in the clusters.

UncertaintyQueryStrategy(seed, method)

This class is used to sample instances from the pool of unlabeled instances based on uncertainty.

WeightedClusterQueryStrategy(seed)

This class is used to sample instances from the pool of unlabeled instances based on the weighted cluster method.

WrappedQueryStrategy(wrapped_query_strategy, ...)

This class is used to wrap a query strategy with a learner.

class alpbench.pipeline.QueryStrategy.ActiveMLEnsembleQueryStrategy(seed, qs, ensemble_size)[source]¶

Bases: ActiveMLQueryStrategy

This class is an abstract class for active learning query strategies that are ensemble-based. The query strategies are used to sample instances from the pool of unlabeled instances.

Parameters:
  • seed (int) – The seed for the random number generator.

  • qs – object

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

qs¶

object

ensemble_size¶

The size of the ensemble.

Type:

int

get_params()[source]¶
sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.ActiveMLModelBasedQueryStrategy(seed, qs)[source]¶

Bases: ActiveMLQueryStrategy

This class is an abstract class for active learning query strategies that are model-based. The query strategies are used to sample instances from the pool of unlabeled instances.

Parameters:
  • seed (int) – The seed for the random number generator.

  • qs – object

seed¶

The seed for the random number generator.

Type:

int

qs¶

object

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.ActiveMLQueryStrategy(seed, qs)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is an abstract class for active learning query strategies. The query strategies are used to sample instances from the pool of unlabeled instances.

Parameters:
  • seed (int) – The seed for the random number generator.

  • qs – object

seed¶

The seed for the random number generator.

Type:

int

qs¶

object

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.BALDQueryStrategy(seed, ensemble_size)[source]¶

Bases: EnsemblePseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the BALD method.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
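BALD is commonly formulated as the mutual information between the prediction and the model, estimated from an ensemble as the entropy of the averaged prediction minus the average entropy of the individual members. The following is a minimal, dependency-free sketch of that score for a single instance; the function names are illustrative and are not part of alpbench, whose implementation may differ.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(member_probs):
    """Mutual-information score for one instance.

    member_probs holds one predicted class distribution per ensemble member
    (hypothetical input format, chosen for this sketch).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Average the members' predictions class by class.
    mean = [sum(m[c] for m in member_probs) / n_members for c in range(n_classes)]
    # Average uncertainty of the individual members.
    expected_entropy = sum(entropy(m) for m in member_probs) / n_members
    # High when members are individually confident but disagree with each other.
    return entropy(mean) - expected_entropy


agree = [[0.5, 0.5], [0.5, 0.5]]
disagree = [[1.0, 0.0], [0.0, 1.0]]
print(bald_score(agree), bald_score(disagree))
```

Instances would then be ranked by this score and the highest-scoring ones queried.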
class alpbench.pipeline.QueryStrategy.BatchBaldQueryStrategy(seed, ensemble_size)[source]¶

Bases: ActiveMLEnsembleQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the BatchBALD method.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int

class alpbench.pipeline.QueryStrategy.ClusterMarginQueryStrategy(seed)[source]¶

Bases: EmbeddingBasedQueryStrategy

ClusterMargin

This class is used to sample instances from the pool of unlabeled instances based on the cluster margin method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.CoreSetQueryStrategy(seed)[source]¶

Bases: EmbeddingBasedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the core set method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
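Core-set selection is often approximated with a greedy k-center procedure: repeatedly pick the pool point that is farthest from everything already covered. The sketch below illustrates that idea on plain tuples; `k_center_greedy` and its argument layout are assumptions made for this example, not the alpbench implementation, and it expects at least one labeled point.

```python
import math


def k_center_greedy(labeled, pool, num_queries):
    """Return indices into pool chosen by greedy k-center selection."""
    # Distance from each pool point to its nearest labeled point.
    min_dist = [min(math.dist(x, c) for c in labeled) for x in pool]
    selected = []
    remaining = set(range(len(pool)))
    for _ in range(min(num_queries, len(pool))):
        # Pick the remaining point that is farthest from all current centers.
        idx = max(sorted(remaining), key=lambda i: min_dist[i])
        selected.append(idx)
        remaining.remove(idx)
        # The new center may now be the nearest one for some remaining points.
        for i in remaining:
            min_dist[i] = min(min_dist[i], math.dist(pool[i], pool[idx]))
    return selected


labeled = [(0.0, 0.0)]
pool = [(1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
print(k_center_greedy(labeled, pool, 2))
```

In an embedding-based strategy the points would be embeddings of the labeled and unlabeled instances rather than raw features.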
class alpbench.pipeline.QueryStrategy.DiscriminativeQueryStrategy(seed)[source]¶

Bases: ActiveMLQueryStrategy

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.EmbeddingBasedQueryStrategy(seed)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is an abstract class for query strategies that are based on embeddings. The query strategies are used to sample instances from the pool of unlabeled instances.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

compute_embedding(learner, X_l, y_l, X_u, transform_labeled=False)[source]¶
abstract sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.EnsemblePseudoRandomizedQueryStrategy(seed, ensemble_size)[source]¶

Bases: PseudoRandomizedQueryStrategy

get_params()[source]¶
abstract sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.EntropyQueryStrategy(seed)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on entropy.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
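As an illustration of entropy-based selection, the sketch below scores each unlabeled instance by the Shannon entropy of its predicted class probabilities and returns the indices with the highest scores. The helper names and the list-of-lists input are assumptions for this example; alpbench's own code may compute and rank the scores differently.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_by_entropy(pool_probs, num_queries):
    """Return indices of the num_queries instances with the highest entropy."""
    scores = [shannon_entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:num_queries]


pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, high entropy
    [0.60, 0.30, 0.10],
]
print(select_by_entropy(pool_probs, 2))
```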
class alpbench.pipeline.QueryStrategy.EpistemicUncertaintyQueryStrategy(seed)[source]¶

Bases: ActiveMLModelBasedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on epistemic uncertainty.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

class alpbench.pipeline.QueryStrategy.ExpectedAveragePrecision(seed)[source]¶

Bases: UncertaintyQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the expected average precision.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

class alpbench.pipeline.QueryStrategy.FalcunQueryStrategy(seed)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the FALCUN method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.KMeansQueryStrategy(seed)[source]¶

Bases: EmbeddingBasedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the KMeans method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.LeastConfidentQueryStrategy(seed)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on least confidence.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
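Least-confidence sampling is usually described as preferring instances whose most probable class has the lowest predicted probability. A minimal sketch under that definition follows; the function names are hypothetical and not taken from alpbench.

```python
def least_confidence_scores(pool_probs):
    """Score each instance by one minus the probability of its top class."""
    return [1.0 - max(p) for p in pool_probs]


def select_least_confident(pool_probs, num_queries):
    """Return indices of the instances the model is least confident about."""
    scores = least_confidence_scores(pool_probs)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:num_queries]


pool_probs = [[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]]
print(select_least_confident(pool_probs, 2))
```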
class alpbench.pipeline.QueryStrategy.MarginQueryStrategy(seed)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the margin method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
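Margin sampling typically looks at the gap between the two most probable classes and queries the instances where that gap is smallest. The following sketch shows this under the assumption that class probabilities are available as plain lists; the names are illustrative only.

```python
def margin(probs):
    """Difference between the two largest class probabilities."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_smallest_margin(pool_probs, num_queries):
    """Return indices of the instances with the smallest margin."""
    margins = [margin(p) for p in pool_probs]
    ranked = sorted(range(len(margins)), key=lambda i: margins[i])
    return ranked[:num_queries]


pool_probs = [[0.5, 0.4, 0.1], [0.8, 0.1, 0.1], [0.45, 0.44, 0.11]]
print(select_smallest_margin(pool_probs, 2))
```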
class alpbench.pipeline.QueryStrategy.MaxEntropyQueryStrategy(seed, ensemble_size)[source]¶

Bases: EnsemblePseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the maximum entropy method.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.MinMarginQueryStrategy(seed, ensemble_size)[source]¶

Bases: EnsemblePseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the minimum margin method.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.MonteCarloEERLogLoss(seed)[source]¶

Bases: MonteCarloEERStrategy

This class is used to sample instances from the pool of unlabeled instances based on the Monte Carlo expected error reduction (EER) method with log loss.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

class alpbench.pipeline.QueryStrategy.MonteCarloEERMisclassification(seed)[source]¶

Bases: MonteCarloEERStrategy

This class is used to sample instances from the pool of unlabeled instances based on the Monte Carlo expected error reduction (EER) method with misclassification loss.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

class alpbench.pipeline.QueryStrategy.MonteCarloEERStrategy(seed, method)[source]¶

Bases: ActiveMLModelBasedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the Monte Carlo EER method.

Parameters:
  • seed (int) – The seed for the random number generator.

  • method (str) – The method for Monte Carlo EER.

seed¶

The seed for the random number generator.

Type:

int

qs¶

object

class alpbench.pipeline.QueryStrategy.PowerBALDQueryStrategy(seed, ensemble_size)[source]¶

Bases: EnsemblePseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on a power version of BALD.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.PowerMarginQueryStrategy(seed)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the power margin method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.PseudoRandomizedQueryStrategy(seed)[source]¶

Bases: QueryStrategy

This class is an abstract class for query strategies that are pseudo-randomized, meaning that their selections can be reproduced by reusing the same random seed.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

get_params()[source]¶
abstract sample(learner, X_l, y_l, X_u, num_queries)[source]¶
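To illustrate what seed-based reproducibility means in practice, the sketch below draws a random batch of pool indices from a private generator created from the seed. How alpbench actually consumes `seed` is not shown on this page, so `random_query` is a hypothetical stand-in that uses only the Python standard library.

```python
import random


def random_query(pool_size, num_queries, seed):
    """Draw num_queries distinct pool indices, reproducibly for a given seed."""
    # A private generator keeps the draw independent of global random state.
    rng = random.Random(seed)
    return rng.sample(range(pool_size), num_queries)


first = random_query(pool_size=100, num_queries=5, seed=42)
second = random_query(pool_size=100, num_queries=5, seed=42)
print(first == second)
```

Two calls with the same seed and the same pool size return the same indices, which is the property that makes benchmark runs repeatable.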
class alpbench.pipeline.QueryStrategy.QBCVarianceRatioQueryStrategy(seed, ensemble_size)[source]¶

Bases: EnsemblePseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee (QBC) method with variance ratio as the measure of ensemble disagreement.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
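The variance ratio is commonly defined as one minus the fraction of committee members that vote for the most popular class. The sketch below computes it for one instance from hard votes; the function name and input format are assumptions for this example rather than alpbench's API.

```python
from collections import Counter


def variance_ratio(votes):
    """Disagreement of a committee given one predicted label per member."""
    # Number of members that voted for the most common label.
    modal_count = Counter(votes).most_common(1)[0][1]
    return 1.0 - modal_count / len(votes)


print(variance_ratio([0, 0, 0, 0]))  # unanimous committee
print(variance_ratio([0, 1, 1, 2]))  # split committee
```

A value of zero means all members agree; larger values indicate more disagreement and therefore a more attractive query.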
class alpbench.pipeline.QueryStrategy.QueryByCommitteeEntropyQueryStrategy(seed, ensemble_size)[source]¶

Bases: QueryByCommitteeQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee method with entropy as the measure of ensemble disagreement.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int

class alpbench.pipeline.QueryStrategy.QueryByCommitteeKLQueryStrategy(seed, ensemble_size)[source]¶

Bases: QueryByCommitteeQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee method with KL divergence as the measure of ensemble disagreement.

Parameters:
  • seed (int) – The seed for the random number generator.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

ensemble_size¶

The size of the ensemble.

Type:

int
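One common way to measure committee disagreement with KL divergence is to average each member's divergence from the consensus (mean) prediction. The following sketch assumes that formulation and a list of per-member class distributions; it is not taken from alpbench, whose variant may differ.

```python
import math


def kl_divergence(p, q):
    """KL divergence KL(p || q) in nats; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def mean_kl_to_consensus(member_probs):
    """Average KL divergence of each committee member from the mean prediction."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    consensus = [sum(m[c] for m in member_probs) / n_members for c in range(n_classes)]
    # consensus[c] > 0 whenever any member assigns mass to class c,
    # so the division inside kl_divergence is safe.
    return sum(kl_divergence(m, consensus) for m in member_probs) / n_members


print(mean_kl_to_consensus([[0.5, 0.5], [0.5, 0.5]]))
print(mean_kl_to_consensus([[1.0, 0.0], [0.0, 1.0]]))
```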

class alpbench.pipeline.QueryStrategy.QueryByCommitteeQueryStrategy(seed, method, ensemble_size)[source]¶

Bases: ActiveMLEnsembleQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the Query by Committee method.

Parameters:
  • seed (int) – The seed for the random number generator.

  • method (str) – The method for Query by Committee.

  • ensemble_size (int) – The size of the ensemble.

seed¶

The seed for the random number generator.

Type:

int

qs¶

object

ensemble_size¶

The size of the ensemble.

Type:

int

class alpbench.pipeline.QueryStrategy.QueryStrategy[source]¶

Bases: ABC

This class is an abstract class for query strategies. The query strategies are used to sample instances from the pool of unlabeled instances.

abstract get_params()[source]¶

This method returns the parameters of the query strategy.

abstract sample(learner, X_l, y_l, X_u, num_queries)[source]¶

This method samples instances from the pool of unlabeled instances. It is given a learner that has already been fitted on the labeled data and that may be used to predict probabilities for the unlabeled data.

Parameters:
  • learner – object

  • X_l – np.ndarray

  • y_l – np.ndarray

  • X_u – np.ndarray

  • num_queries – int
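The sketch below shows how a custom strategy might implement this interface. So that it runs without alpbench installed, it defines a local stand-in for the abstract base class with the two documented methods. The toy `FirstKQueryStrategy`, the dictionary returned by `get_params`, and the assumption that `sample` returns indices into `X_u` are all hypothetical; the page does not state the return types.

```python
from abc import ABC, abstractmethod


class QueryStrategy(ABC):
    """Local stand-in mirroring the documented abstract interface."""

    @abstractmethod
    def get_params(self):
        """Return the parameters of the query strategy."""

    @abstractmethod
    def sample(self, learner, X_l, y_l, X_u, num_queries):
        """Select instances from the unlabeled pool X_u."""


class FirstKQueryStrategy(QueryStrategy):
    """Hypothetical toy strategy: always query the first pool instances."""

    def get_params(self):
        # Assumption: parameters are reported as a dictionary.
        return {}

    def sample(self, learner, X_l, y_l, X_u, num_queries):
        # Assumption: the result is a list of indices into X_u.
        return list(range(min(num_queries, len(X_u))))


strategy = FirstKQueryStrategy()
print(strategy.sample(None, [], [], [[1.0], [2.0], [3.0]], 2))
```

A real strategy would use `learner`, `X_l`, and `y_l` to score the pool instead of ignoring them.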

class alpbench.pipeline.QueryStrategy.RandomMarginQueryStrategy(seed)[source]¶

Bases: PseudoRandomizedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the random margin method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.RandomQueryStrategy(seed)[source]¶

Bases: ActiveMLQueryStrategy

This class is used to sample instances from the pool of unlabeled instances randomly.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

class alpbench.pipeline.QueryStrategy.TypicalClusterQueryStrategy(seed)[source]¶

Bases: EmbeddingBasedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the typicality of the instances in the clusters.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.UncertaintyQueryStrategy(seed, method)[source]¶

Bases: ActiveMLQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on uncertainty.

Parameters:
  • seed (int) – The seed for the random number generator.

  • method (str) – The method for uncertainty sampling.

seed¶

The seed for the random number generator.

Type:

int

qs¶

object

class alpbench.pipeline.QueryStrategy.WeightedClusterQueryStrategy(seed)[source]¶

Bases: EmbeddingBasedQueryStrategy

This class is used to sample instances from the pool of unlabeled instances based on the weighted cluster method.

Parameters:

seed (int) – The seed for the random number generator.

seed¶

The seed for the random number generator.

Type:

int

sample(learner, X_l, y_l, X_u, num_queries)[source]¶
class alpbench.pipeline.QueryStrategy.WrappedQueryStrategy(wrapped_query_strategy, learner)[source]¶

Bases: QueryStrategy

This class is used to wrap a query strategy with a learner. The wrapped query strategy is used to sample instances from the pool of unlabeled instances.

Parameters:
  • wrapped_query_strategy (QueryStrategy) – object

  • learner – object

wrapped_query_strategy¶

object

learner¶

object

get_params()[source]¶
sample(learner, X_l, y_l, X_u, num_queries)[source]¶