alpbench.util.pytorch_tabnet.abstract_model

Classes

TabModel([n_d, n_a, n_steps, gamma, ...])

Class for TabNet model.

class alpbench.util.pytorch_tabnet.abstract_model.TabModel(n_d=8, n_a=8, n_steps=3, gamma=1.3, cat_idxs=<factory>, cat_dims=<factory>, cat_emb_dim=1, n_independent=2, n_shared=2, epsilon=1e-15, momentum=0.02, lambda_sparse=0.001, seed=0, clip_value=1, verbose=1, optimizer_fn=<class 'torch.optim.adam.Adam'>, optimizer_params=<factory>, scheduler_fn=None, scheduler_params=<factory>, mask_type='sparsemax', input_dim=None, output_dim=None, device_name='auto', n_shared_decoder=1, n_indep_decoder=1, grouped_features=<factory>)[source]

Bases: BaseEstimator

Class for TabNet model.

optimizer_fn

alias of Adam

abstract compute_loss(y_score, y_true)[source]

Compute the loss.

Parameters:
  • y_score (torch.Tensor) – Score matrix

  • y_true (torch.Tensor) – Target matrix

Returns:

Loss value

Return type:

float

explain(X, normalize=False)[source]

Return local explanations.

Parameters:
  • X (torch.Tensor or scipy.sparse.csr_matrix) – Input data

  • normalize (bool (default False)) – Whether to normalize so that the feature importances of each sample sum to 1

Returns:

  • M_explain (matrix) – Importance per sample, per column.

  • masks (matrix) – Sparse matrix showing attention masks used by network.
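
The effect described for normalize can be illustrated with a minimal sketch. It assumes importances are non-negative and that normalization is applied per sample (per row); normalize_rows is a hypothetical helper written for this illustration, not a function of the library.

```python
def normalize_rows(importances):
    """Scale each row so its entries sum to 1 (all-zero rows are left unchanged)."""
    normalized = []
    for row in importances:
        total = sum(row)
        if total == 0:
            # Nothing to scale: avoid dividing by zero.
            normalized.append(list(row))
        else:
            normalized.append([value / total for value in row])
    return normalized


# Two samples, three feature columns (made-up importances).
m_explain = [[2.0, 1.0, 1.0], [0.0, 3.0, 1.0]]
normalized = normalize_rows(m_explain)
```

After normalization, the values in a row can be compared across samples as shares of that sample's total importance.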

fit(X_train, y_train, eval_set=None, eval_name=None, eval_metric=None, loss_fn=None, weights=0, max_epochs=100, patience=10, batch_size=1024, virtual_batch_size=128, num_workers=0, drop_last=True, callbacks=None, pin_memory=True, from_unsupervised=None, warm_start=False, augmentations=None, compute_importance=True)[source]

Train the neural network stored in self.network, using train_dataloader for training data and valid_dataloader for validation.

Parameters:
  • X_train (np.ndarray) – Train set

  • y_train (np.array) – Train targets

  • eval_set (list of tuple) – List of evaluation sets, each an (X, y) tuple. The last one is used for early stopping.

  • eval_name (list of str) – List of eval set names.

  • eval_metric (list of str) – List of evaluation metrics. The last metric is used for early stopping.

  • loss_fn (callable or None) – A PyTorch loss function

  • weights (bool or dictionary) – 0 for no balancing, 1 for automated balancing, or a dict for custom weights per class

  • max_epochs (int) – Maximum number of epochs during training

  • patience (int) – Number of consecutive non-improving epochs before early stopping

  • batch_size (int) – Training batch size

  • virtual_batch_size (int) – Batch size for Ghost Batch Normalization (virtual_batch_size < batch_size)

  • num_workers (int) – Number of workers used in torch.utils.data.DataLoader

  • drop_last (bool) – Whether to drop last batch during training

  • callbacks (list of callback function) – List of custom callbacks

  • pin_memory (bool) – Whether to set pin_memory to True or False during training

  • from_unsupervised (unsupervised trained model) – Use a previously self-supervised model as starting weights

  • warm_start (bool) – If True, current model parameters are used to start training

  • compute_importance (bool) – Whether to compute feature importance
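
How max_epochs and patience interact can be sketched in plain Python. This is an illustration of the documented behaviour, not the library's training loop: it assumes a metric where higher is better, and run_with_patience and the score values are hypothetical.

```python
def run_with_patience(scores, patience, max_epochs=100):
    """Return (best_epoch, epochs_done) for a sequence of per-epoch scores."""
    best_score = None
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_done = 0
    for epoch, score in enumerate(scores[:max_epochs]):
        epochs_done = epoch + 1
        if best_score is None or score > best_score:
            # Strict improvement resets the counter.
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # early stopping
    return best_epoch, epochs_done


# Made-up validation scores, one per epoch.
scores = [0.70, 0.74, 0.78, 0.77, 0.78, 0.76, 0.79]
best_epoch, epochs_done = run_with_patience(scores, patience=3)
```

With patience=3 the loop stops after the sixth epoch and never sees the final score of 0.79, which shows the trade-off a small patience value makes.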

load_class_attrs(class_attrs)[source]

load_model(filepath)[source]

Load TabNet model.

Parameters:

filepath (str) – Path of the model.

load_weights_from_unsupervised(unsupervised_model)[source]

predict(X)[source]

Make predictions on a batch of input data (validation mode).

Parameters:

X (torch.Tensor or scipy.sparse.csr_matrix) – Input data

Returns:

predictions – Predictions of the regression problem

Return type:

np.array

abstract prepare_target(y)[source]

Prepare target before training.

Parameters:

y (torch.Tensor) – Target matrix.

Returns:

Converted target matrix.

Return type:

torch.Tensor

save_model(path)[source]

Save the TabNet model in two distinct files.

Parameters:

path (str) – Path of the model.

Returns:

Input filepath with “.zip” appended

Return type:

str
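
The documented return value (the input path with “.zip” appended) matches what the standard library's shutil.make_archive returns. The sketch below shows that pattern under stated assumptions: save_bundle, the two file names, and the byte payload are hypothetical and only stand in for whatever save_model actually writes.

```python
import json
import os
import shutil
import tempfile


def save_bundle(path, params, weights_bytes):
    """Write two files under `path`, zip the folder, and return the archive path."""
    os.makedirs(path, exist_ok=True)
    # File names are illustrative assumptions, not the library's layout.
    with open(os.path.join(path, "model_params.json"), "w", encoding="utf-8") as f:
        json.dump(params, f)
    with open(os.path.join(path, "network.bin"), "wb") as f:
        f.write(weights_bytes)
    # make_archive returns the base name with ".zip" appended.
    archive = shutil.make_archive(path, "zip", root_dir=path)
    shutil.rmtree(path)  # keep only the archive
    return archive


workdir = tempfile.mkdtemp()
saved = save_bundle(os.path.join(workdir, "tabnet_demo"), {"n_d": 8, "n_a": 8}, b"\x00\x01")
```

Because the returned string already carries the extension, it can be passed on unchanged to whatever loads the archive later.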

abstract update_fit_params(X_train, y_train, eval_set, weights)[source]

Set attributes relative to fit function.

Parameters:
  • X_train (np.ndarray) – Train set

  • y_train (np.array) – Train targets

  • eval_set (list of tuple) – List of evaluation sets, each an (X, y) tuple.

  • weights (bool or dictionary) – 0 for no balancing, 1 for automated balancing
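
The source does not say how automated balancing is computed. One common scheme, assumed here purely for illustration, weights each sample by the inverse frequency of its class; a dict of custom weights simply maps each class label to its weight. Both helper names below are hypothetical.

```python
from collections import Counter


def balanced_sample_weights(y_train):
    """Assumed scheme: weight each sample by 1 / (number of samples in its class)."""
    counts = Counter(y_train)
    return [1.0 / counts[label] for label in y_train]


def custom_sample_weights(y_train, class_weights):
    """Look up a user-supplied weight for each sample's class."""
    return [class_weights[label] for label in y_train]


# Imbalanced toy targets: three samples of class 0, one of class 1.
y_train = [0, 0, 0, 1]
auto = balanced_sample_weights(y_train)
custom = custom_sample_weights(y_train, {0: 1.0, 1: 5.0})
```

Under the inverse-frequency assumption, each class contributes the same total weight, so the minority class is not drowned out during sampling.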

cat_dims: list[int]
cat_emb_dim: int = 1
cat_idxs: list[int]
clip_value: int = 1
device_name: str = 'auto'
epsilon: float = 1e-15
gamma: float = 1.3
grouped_features: list[list[int]]
input_dim: int = None
lambda_sparse: float = 0.001
mask_type: str = 'sparsemax'
momentum: float = 0.02
n_a: int = 8
n_d: int = 8
n_indep_decoder: int = 1
n_independent: int = 2
n_shared: int = 2
n_shared_decoder: int = 1
n_steps: int = 3
optimizer_params: dict
output_dim: int = None
scheduler_fn: Any = None
scheduler_params: dict
seed: int = 0
verbose: int = 1
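
The <factory> placeholders in the class signature are how dataclass fields with a default_factory are rendered. A minimal sketch of that pattern follows; TabModelConfig is a hypothetical stand-in with only a few of the fields, and the empty list and dict factories are assumptions, since the source does not show what the real factories return.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TabModelConfig:  # hypothetical stand-in, not the real TabModel
    n_d: int = 8
    n_a: int = 8
    n_steps: int = 3
    gamma: float = 1.3
    # Mutable defaults must come from a factory so instances do not share them.
    cat_idxs: List[int] = field(default_factory=list)
    cat_dims: List[int] = field(default_factory=list)
    optimizer_params: Dict[str, Any] = field(default_factory=dict)


first = TabModelConfig()
second = TabModelConfig(n_steps=5)
first.cat_idxs.append(0)  # only affects `first`
```

Each instance gets its own list and dict, so changing cat_idxs on one model configuration does not leak into another.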