alpbench.util.pytorch_tabnet.abstract_model

Classes

TabModel([n_d, n_a, n_steps, gamma, ...])

Class for TabNet model.

class alpbench.util.pytorch_tabnet.abstract_model.TabModel(n_d=8, n_a=8, n_steps=3, gamma=1.3, cat_idxs=<factory>, cat_dims=<factory>, cat_emb_dim=1, n_independent=2, n_shared=2, epsilon=1e-15, momentum=0.02, lambda_sparse=0.001, seed=0, clip_value=1, verbose=1, optimizer_fn=<class 'torch.optim.adam.Adam'>, optimizer_params=<factory>, scheduler_fn=None, scheduler_params=<factory>, mask_type='sparsemax', input_dim=None, output_dim=None, device_name='auto', n_shared_decoder=1, n_indep_decoder=1, grouped_features=<factory>)

Bases: BaseEstimator

Class for TabNet model.

optimizer_fn: alias of Adam

abstract compute_loss(y_score, y_true)

Compute the loss.

Parameters:

y_score (torch.Tensor) – Score matrix
y_true (torch.Tensor) – Target matrix

Returns:

Loss value

Return type:

float

explain(X, normalize=False)

Return local explanations.

Parameters:

X (torch.Tensor or scipy.sparse.csr_matrix) – Input data
normalize (bool, default False) – Whether to normalize so that the feature importances sum to 1

Returns:

M_explain (matrix) – Importance per sample, per column.
masks (matrix) – Sparse matrix showing attention masks used by network.
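
The effect of normalize=True can be illustrated on a plain list-of-lists importance matrix. This is a minimal sketch, not the library's implementation: it assumes the normalization is applied per sample (per row), and the helper name normalize_rows and the sample values are hypothetical.

```python
def normalize_rows(matrix):
    """Scale each row so its entries sum to 1 (all-zero rows are left unchanged)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            # Avoid dividing by zero for a sample with no attributed importance.
            normalized.append(list(row))
        else:
            normalized.append([value / total for value in row])
    return normalized


# Hypothetical per-sample importances for three features.
m_explain = [[2.0, 1.0, 1.0], [0.0, 3.0, 1.0]]
normalized = normalize_rows(m_explain)
```

After normalization, each row can be read as the share of importance assigned to each column for that sample.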

fit(X_train, y_train, eval_set=None, eval_name=None, eval_metric=None, loss_fn=None, weights=0, max_epochs=100, patience=10, batch_size=1024, virtual_batch_size=128, num_workers=0, drop_last=True, callbacks=None, pin_memory=True, from_unsupervised=None, warm_start=False, augmentations=None, compute_importance=True)

Train the neural network stored in self.network, using train_dataloader for training data and valid_dataloader for validation.

Parameters:

X_train (np.ndarray) – Train set
y_train (np.array) – Train targets
eval_set (list of tuple) – List of evaluation sets, each an (X, y) tuple. The last one is used for early stopping.
eval_name (list of str) – List of eval set names.
eval_metric (list of str) – List of evaluation metrics. The last metric is used for early stopping.
loss_fn (callable or None) – A PyTorch loss function
weights (bool or dictionary) – 0 for no balancing, 1 for automated balancing, or a dict of custom weights per class
max_epochs (int) – Maximum number of epochs during training
patience (int) – Number of consecutive non-improving epochs before early stopping
batch_size (int) – Training batch size
virtual_batch_size (int) – Batch size for Ghost Batch Normalization (virtual_batch_size < batch_size)
num_workers (int) – Number of workers used in torch.utils.data.DataLoader
drop_last (bool) – Whether to drop the last batch during training
callbacks (list of callback functions) – List of custom callbacks
pin_memory (bool) – Whether to set pin_memory to True or False during training
from_unsupervised (unsupervised trained model) – Use a previously self-supervised model as starting weights
warm_start (bool) – If True, current model parameters are used to start training
compute_importance (bool) – Whether to compute feature importance
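
The role of patience in early stopping can be sketched in plain Python. This is an illustration only, not the library's training loop: the helper run_with_patience and the loss values are hypothetical, and the sketch assumes a monitored metric where lower is better.

```python
def run_with_patience(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best loss).

    Training stops once `patience` consecutive epochs fail to improve on
    the best loss seen so far, or when the losses run out.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, one per epoch.
losses = [0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.5]
stopped_at, best = run_with_patience(losses, patience=3)
```

With patience=3, this run stops at epoch 5, three epochs after the best epoch (2), so the later improvement at epoch 6 is never reached. A larger patience trades longer training for a lower chance of stopping too early.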

load_class_attrs(class_attrs)

load_model(filepath)

Load TabNet model.

Parameters: filepath (str) – Path of the model.

load_weights_from_unsupervised(unsupervised_model)

predict(X)

Make predictions on a batch (valid)

Parameters: X (torch.Tensor or scipy.sparse.csr_matrix) – Input data
Returns: predictions – Predictions of the regression problem
Return type: np.array

abstract prepare_target(y)

Prepare target before training.

Parameters: y (torch.Tensor) – Target matrix.
Returns: Converted target matrix.
Return type: torch.Tensor

save_model(path)

Save the TabNet model in two distinct files.

Parameters: path (str) – Path of the model.
Returns: Input filepath with “.zip” appended
Return type: str
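
The pattern described here (two files bundled so that the returned path is the input path with ".zip" appended) can be sketched with the standard library alone. This is a hedged illustration, not the library's code: the helper save_two_files_as_zip, the file names model_params.json and network.bin, and the byte string standing in for network weights are all hypothetical.

```python
import json
import shutil
import tempfile
import zipfile
from pathlib import Path


def save_two_files_as_zip(path, params, weights_blob):
    """Write params and weights to a staging directory, then archive it as path + '.zip'."""
    with tempfile.TemporaryDirectory() as staging:
        staging_dir = Path(staging)
        # File 1: hyperparameters as JSON (hypothetical file name).
        (staging_dir / "model_params.json").write_text(json.dumps(params))
        # File 2: placeholder bytes standing in for serialized network weights.
        (staging_dir / "network.bin").write_bytes(weights_blob)
        # make_archive appends the ".zip" extension to the base name itself.
        shutil.make_archive(path, "zip", root_dir=staging)
    return f"{path}.zip"


out_dir = tempfile.mkdtemp()
saved = save_two_files_as_zip(
    str(Path(out_dir) / "demo_model"), {"n_d": 8, "n_a": 8}, b"\x00\x01"
)
with zipfile.ZipFile(saved) as archive:
    names = sorted(archive.namelist())
```

Because the extension is appended to the given path, callers should pass a path without ".zip" and use the returned string when loading the archive later.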

abstract update_fit_params(X_train, y_train, eval_set, weights)

Set attributes relative to fit function.

Parameters:

X_train (np.ndarray) – Train set
y_train (np.array) – Train targets
eval_set (list of tuple) – List of evaluation sets, each an (X, y) tuple.
weights (bool or dictionary) – 0 for no balancing, 1 for automated balancing
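
One way to picture the weights options (0, 1, or a dict of per-class weights as accepted by fit) is as a per-sample weight vector. The sketch below assumes that automated balancing means weighting each sample by the inverse of its class frequency; that is a common interpretation, not something this page confirms. The helper sample_weights is hypothetical.

```python
from collections import Counter


def sample_weights(y, weights=0):
    """Return one weight per sample.

    weights == 0 -> uniform weights (no balancing)
    weights == 1 -> inverse class frequency (assumed meaning of automated balancing)
    dict         -> custom weight looked up per class label
    """
    if isinstance(weights, dict):
        return [float(weights[label]) for label in y]
    if weights == 0:
        return [1.0] * len(y)
    if weights == 1:
        counts = Counter(y)
        return [1.0 / counts[label] for label in y]
    raise ValueError("weights must be 0, 1, or a dict")


# Hypothetical imbalanced targets: three samples of class 0, one of class 1.
y = [0, 0, 0, 1]
uniform = sample_weights(y, 0)
balanced = sample_weights(y, 1)
custom = sample_weights(y, {0: 1, 1: 5})
```

Under this assumption, the single class-1 sample receives three times the weight of each class-0 sample, so both classes contribute equally in total.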

cat_dims: list[int]

cat_emb_dim: int = 1

cat_idxs: list[int]

clip_value: int = 1

device_name: str = 'auto'

epsilon: float = 1e-15

gamma: float = 1.3

grouped_features: list[list[int]]

input_dim: int = None

lambda_sparse: float = 0.001

mask_type: str = 'sparsemax'

momentum: float = 0.02

n_a: int = 8

n_d: int = 8

n_indep_decoder: int = 1

n_independent: int = 2

n_shared: int = 2

n_shared_decoder: int = 1

n_steps: int = 3

optimizer_params: dict

output_dim: int = None

scheduler_fn: Any = None

scheduler_params: dict

seed: int = 0

verbose: int = 1
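
The attributes listed without a default value (cat_idxs, cat_dims, grouped_features, optimizer_params, scheduler_params) are the ones shown as <factory> in the class signature. That placeholder is how Python renders a dataclass field declared with a default_factory, which gives every instance its own fresh list or dict. The sketch below shows the mechanism on a hypothetical DemoConfig class; the factories used here (list and dict) are assumptions and may differ from the ones TabModel uses.

```python
import inspect
from dataclasses import dataclass, field


@dataclass
class DemoConfig:
    """Hypothetical stand-in showing how <factory> defaults behave."""

    n_steps: int = 3
    # default_factory is called once per instance, so defaults are never shared.
    cat_idxs: list = field(default_factory=list)
    optimizer_params: dict = field(default_factory=dict)


first, second = DemoConfig(), DemoConfig()
first.cat_idxs.append(4)  # mutating one instance leaves the other untouched

# The generated __init__ signature shows "<factory>" for these fields.
signature_text = str(inspect.signature(DemoConfig))
```

Because the defaults are produced per instance, modifying a list or dict attribute on one model does not leak into another model created afterwards.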