alpbench.util.pytorch_tabnet.multiclass_utils

Multi-class / multi-label utility functions

Functions

`assert_all_finite`(X[, allow_nan])	Throw a ValueError if X contains NaN or infinity.
`check_classification_targets`(y)	Ensure that target y is of a non-regression type.
`check_output_dim`(labels, y)
`check_unique_type`(y)
`infer_multitask_output`(y_train)	Infer output_dim from targets. This is for multiple tasks.
`infer_output_dim`(y_train)	Infer output_dim from targets
`is_multilabel`(y)	Check if `y` is in a multilabel format.
`type_of_target`(y)	Determine the type of data indicated by the target.
`unique_labels`(*ys)	Extract an ordered array of unique labels

alpbench.util.pytorch_tabnet.multiclass_utils.assert_all_finite(X, allow_nan=False)

Throw a ValueError if X contains NaN or infinity.

Parameters:

X (array or sparse matrix) –
allow_nan (bool) –
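
As an illustration of the check described above, here is a minimal pure-Python sketch. It is not the module's implementation: the helper name `assert_all_finite_sketch` is hypothetical, and it only handles a flat sequence of numbers, whereas the documented function accepts arrays and sparse matrices.

```python
import math


def assert_all_finite_sketch(values, allow_nan=False):
    """Raise ValueError if `values` contains infinity, or NaN unless allowed.

    Hypothetical stand-in for illustration only: it walks a flat iterable
    of numbers rather than an array or sparse matrix.
    """
    for v in values:
        if math.isinf(v):
            # Infinity is rejected regardless of allow_nan.
            raise ValueError("Input contains infinity.")
        if math.isnan(v) and not allow_nan:
            raise ValueError("Input contains NaN.")


# Finite input passes silently.
assert_all_finite_sketch([0.0, 1.5, -2.0])

# NaN is tolerated only when allow_nan=True.
assert_all_finite_sketch([0.0, float("nan")], allow_nan=True)
```

In this sketch `allow_nan` relaxes only the NaN check, so an infinite value still raises.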

alpbench.util.pytorch_tabnet.multiclass_utils.check_classification_targets(y)

Ensure that target y is of a non-regression type.

Only the following target types (as defined in type_of_target) are allowed: ‘binary’, ‘multiclass’, ‘multiclass-multioutput’, ‘multilabel-indicator’, ‘multilabel-sequences’.

Parameters:

y (array-like) –
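
The allow-list above can be sketched as a simple membership test. This is an assumption-laden illustration rather than the module's code: `ALLOWED_TARGET_TYPES` and `check_target_type_sketch` are hypothetical names, and the sketch takes an already-computed target type string instead of the raw target y.

```python
# Target types accepted for classification, as listed in the text above.
ALLOWED_TARGET_TYPES = {
    "binary",
    "multiclass",
    "multiclass-multioutput",
    "multilabel-indicator",
    "multilabel-sequences",
}


def check_target_type_sketch(target_type):
    """Raise ValueError when `target_type` is not a classification type.

    Hypothetical helper: the documented function derives the type from y
    itself; here the caller passes the type string directly.
    """
    if target_type not in ALLOWED_TARGET_TYPES:
        raise ValueError(f"Unknown label type: {target_type!r}")


check_target_type_sketch("multiclass")  # accepted, returns None

try:
    check_target_type_sketch("continuous")  # a regression-style target
except ValueError as err:
    print(err)
```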

alpbench.util.pytorch_tabnet.multiclass_utils.check_output_dim(labels, y)

alpbench.util.pytorch_tabnet.multiclass_utils.check_unique_type(y)

alpbench.util.pytorch_tabnet.multiclass_utils.infer_multitask_output(y_train)

Infer output_dim from targets. This is for multiple tasks.

Parameters:

y_train (np.ndarray) – Training targets

Returns:

tasks_dims (list) – Number of classes for output
tasks_labels (list) – List of sorted list of initial classes
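
To make the two return values concrete, the sketch below computes them column by column, treating each column of the targets as one task. It is a pure-Python approximation under stated assumptions, not the module's implementation: `infer_multitask_output_sketch` is a hypothetical name, and it takes a list of rows where the documented function takes an np.ndarray.

```python
def infer_multitask_output_sketch(rows):
    """Return (tasks_dims, tasks_labels) for a list of equal-length rows.

    Each row holds one label per task, so column j is treated as task j.
    Hypothetical helper for illustration only.
    """
    if not rows or not isinstance(rows[0], (list, tuple)):
        raise ValueError("expected targets of shape (n_samples, n_tasks)")
    n_tasks = len(rows[0])
    tasks_dims, tasks_labels = [], []
    for j in range(n_tasks):
        # Sorted distinct labels seen for task j.
        labels = sorted({row[j] for row in rows})
        tasks_labels.append(labels)
        tasks_dims.append(len(labels))
    return tasks_dims, tasks_labels


y_train = [[0, "a"], [1, "b"], [0, "a"], [2, "b"]]
tasks_dims, tasks_labels = infer_multitask_output_sketch(y_train)
print(tasks_dims)    # [3, 2]
print(tasks_labels)  # [[0, 1, 2], ['a', 'b']]
```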

alpbench.util.pytorch_tabnet.multiclass_utils.infer_output_dim(y_train)

Infer output_dim from targets

Parameters:

y_train (np.array) – Training targets

Returns:

output_dim (int) – Number of classes for output
train_labels (list) – Sorted list of initial classes
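
The relationship between the two return values can be sketched in a few lines: the output dimension is the number of distinct training labels. This is a simplified illustration, not the module's code; `infer_output_dim_sketch` is a hypothetical name, and it assumes a flat sequence of hashable, mutually comparable labels.

```python
def infer_output_dim_sketch(y_train):
    """Return (output_dim, train_labels) for a flat sequence of labels.

    Hypothetical helper: train_labels is the sorted list of distinct
    labels, and output_dim is its length.
    """
    train_labels = sorted(set(y_train))
    return len(train_labels), train_labels


output_dim, train_labels = infer_output_dim_sketch([3, 5, 5, 7, 3])
print(output_dim)    # 3
print(train_labels)  # [3, 5, 7]
```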

alpbench.util.pytorch_tabnet.multiclass_utils.is_multilabel(y)

Check if y is in a multilabel format.

Parameters:

y (numpy array of shape [n_samples]) – Target values.

Returns:

out – True if y is in a multilabel format, else False.

Return type:

bool

Examples

>>> import numpy as np
>>> from alpbench.util.pytorch_tabnet.multiclass_utils import is_multilabel
>>> is_multilabel([0, 1, 0, 1])
False
>>> is_multilabel([[1], [0, 2], []])
False
>>> is_multilabel(np.array([[1, 0], [0, 0]]))
True
>>> is_multilabel(np.array([[1], [0], [0]]))
False
>>> is_multilabel(np.array([[1, 0, 0]]))
True

alpbench.util.pytorch_tabnet.multiclass_utils.type_of_target(y)

Determine the type of data indicated by the target.

Note that this type is the most specific type that can be inferred. For example:

binary is more specific but compatible with multiclass.

multiclass of integers is more specific but compatible with continuous.

multilabel-indicator is more specific but compatible with multiclass-multioutput.

Parameters:

y (array-like) –

Returns:

target_type – One of:

‘continuous’: y is an array-like of floats that are not all integers, and is 1d or a column vector.
‘continuous-multioutput’: y is a 2d array of floats that are not all integers, and both dimensions are of size > 1.
‘binary’: y contains <= 2 discrete values and is 1d or a column vector.
‘multiclass’: y contains more than two discrete values, is not a sequence of sequences, and is 1d or a column vector.
‘multiclass-multioutput’: y is a 2d array that contains more than two discrete values, is not a sequence of sequences, and both dimensions are of size > 1.
‘multilabel-indicator’: y is a label indicator matrix, an array of two dimensions with at least two columns, and at most 2 unique values.
‘unknown’: y is array-like but none of the above, such as a 3d array, sequence of sequences, or an array of non-sequence objects.

Return type:

string

Examples

>>> import numpy as np
>>> from alpbench.util.pytorch_tabnet.multiclass_utils import type_of_target
>>> type_of_target([0.1, 0.6])
'continuous'
>>> type_of_target([1, -1, -1, 1])
'binary'
>>> type_of_target(['a', 'b', 'a'])
'binary'
>>> type_of_target([1.0, 2.0])
'binary'
>>> type_of_target([1, 0, 2])
'multiclass'
>>> type_of_target([1.0, 0.0, 3.0])
'multiclass'
>>> type_of_target(['a', 'b', 'c'])
'multiclass'
>>> type_of_target(np.array([[1, 2], [3, 1]]))
'multiclass-multioutput'
>>> type_of_target([[1, 2]])
'multiclass-multioutput'
>>> type_of_target(np.array([[1.5, 2.0], [3.0, 1.6]]))
'continuous-multioutput'
>>> type_of_target(np.array([[0, 1], [1, 1]]))
'multilabel-indicator'

alpbench.util.pytorch_tabnet.multiclass_utils.unique_labels(*ys)

Extract an ordered array of unique labels

We don’t allow:

mix of multilabel and multiclass (single label) targets
mix of label indicator matrix and anything else (because there are no explicit labels)
mix of label indicator matrices of different sizes
mix of string and integer labels

At the moment, we also don’t allow “multiclass-multioutput” input type.

Parameters:

*ys (array-likes) –

Returns:

out – An ordered array of unique labels.

Return type:

numpy array of shape [n_unique_labels]

Examples

>>> from alpbench.util.pytorch_tabnet.multiclass_utils import unique_labels
>>> unique_labels([3, 5, 5, 5, 7, 7])
array([3, 5, 7])
>>> unique_labels([1, 2, 3, 4], [2, 2, 3, 4])
array([1, 2, 3, 4])
>>> unique_labels([1, 2, 10], [5, 11])
array([ 1,  2,  5, 10, 11])
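
The union behaviour shown in the examples, together with the rule against mixing string and integer labels, can be sketched in pure Python. This is a simplified illustration, not the module's implementation: `unique_labels_sketch` is a hypothetical name, it returns a list rather than a numpy array, and it covers only flat single-label inputs (none of the indicator-matrix rules).

```python
def unique_labels_sketch(*ys):
    """Return the sorted union of labels across several flat sequences.

    Hypothetical helper for illustration: raises ValueError when string
    and non-string labels are mixed, mirroring the restriction above.
    """
    labels = set()
    for y in ys:
        labels.update(y)
    # Mixing strings with numbers would make the ordering ill-defined.
    if len({isinstance(label, str) for label in labels}) > 1:
        raise ValueError("Mix of label input types (string and number)")
    return sorted(labels)


print(unique_labels_sketch([1, 2, 10], [5, 11]))  # [1, 2, 5, 10, 11]

try:
    unique_labels_sketch([1, 2], ["a", "b"])
except ValueError as err:
    print(err)
```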