skbel.learning

This package contains the classes for the learning process of the SKBEL algorithm.

skbel.learning.bel

Bayesian Evidential Learning Framework.

Currently, the common practice is to first transform predictor and target variables through PCA, and then apply CCA.

Alternative blueprints could be written in the same style as the BEL class implementing the classic scheme.

class skbel.learning.bel.BEL(mode: str = 'tm', copy: bool = True, *, X_pre_processing=None, Y_pre_processing=None, X_post_processing=None, Y_post_processing=None, regression_model=None, n_comp_cca=None, x_dim=None, y_dim=None, random_state=None)[source]

Bases: TransformerMixin, MultiOutputMixin, BaseEstimator

Heart of the framework. Inherits from scikit-learn base classes. BEL stands for Bayesian Evidential Learning.

property X_shape

Predictor original shape.

property Y_shape

Target original shape.

__init__(mode: str = 'tm', copy: bool = True, *, X_pre_processing=None, Y_pre_processing=None, X_post_processing=None, Y_post_processing=None, regression_model=None, n_comp_cca=None, x_dim=None, y_dim=None, random_state=None)[source]

Initialize the BEL class.

Parameters:
  • mode – How to infer the posterior distribution (if CCA is used). “kde”, “mvn” or “tm”.

  • copy – Whether to copy arrays or not (default is True).

  • X_pre_processing – sklearn pipeline for pre-processing the predictor.

  • Y_pre_processing – sklearn pipeline for pre-processing the target.

  • X_post_processing – sklearn pipeline for post-processing the predictor.

  • Y_post_processing – sklearn pipeline for post-processing the target.

  • regression_model – The regression model to use. Default is Canonical Correlation Analysis.

  • n_comp_cca – Number of components to keep in CCA (only if CCA is used).

  • x_dim – Predictor original dimensions.

  • y_dim – Target original dimensions.

  • random_state – Seed to reproduce the same samples.

_sklearn_auto_wrap_output_keys = {'transform'}

cca_pc_transform(X=None, Y=None) -> (array, array)[source]

Transform principal components (PCs) to canonical variates (CVs).

Parameters:
  • X – Predictor array.

  • Y – Target array.

Returns:

The canonical variates (CVs).

fit(X: array = None, Y: array = None)[source]

Fit all pipelines.

Parameters:
  • X – Predictor array.

  • Y – Target array.

Returns:

self

fit_transform(X, y=None, **fit_params)[source]

Fit-Transform across all pipelines.

Parameters:
  • X – Predictor array.

  • y – Target array.

Returns:

If mode == “mvn” - returns the posterior mean and covariance. If mode == “kde” - returns a dictionary of functions.
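For intuition on the “mvn” case, a posterior mean and covariance can be obtained by conditioning a joint Gaussian on the observed predictor. The sketch below shows standard Gaussian conditioning with NumPy. It is not taken from the skbel source: the function name gaussian_posterior, the way noise is added to the predictor covariance, and the synthetic canonical variates are all assumptions.

```python
import numpy as np


def gaussian_posterior(x_cv, y_cv, x_obs, noise=0.0):
    """Condition a joint Gaussian fitted to (x_cv, y_cv) on the observation x_obs."""
    mu_x, mu_y = x_cv.mean(axis=0), y_cv.mean(axis=0)
    n_x = x_cv.shape[1]
    joint_cov = np.cov(np.hstack([x_cv, y_cv]), rowvar=False)
    # Assumption: noise is treated as an observation-noise variance on the predictor.
    s_xx = joint_cov[:n_x, :n_x] + noise * np.eye(n_x)
    s_xy = joint_cov[:n_x, n_x:]
    s_yy = joint_cov[n_x:, n_x:]
    # gain = S_yx S_xx^{-1}, computed with a linear solve instead of an explicit inverse.
    gain = np.linalg.solve(s_xx, s_xy).T
    post_mean = mu_y + gain @ (x_obs - mu_x)
    post_cov = s_yy - gain @ s_xy
    return post_mean, post_cov


rng = np.random.default_rng(0)
x_cv = rng.normal(size=(300, 2))                      # stand-in predictor CVs
y_cv = 0.9 * x_cv + 0.3 * rng.normal(size=(300, 2))   # stand-in target CVs
x_obs = np.array([1.0, -0.5])

post_mean, post_cov = gaussian_posterior(x_cv, y_cv, x_obs)
# Draw posterior samples in the canonical space.
samples = rng.multivariate_normal(post_mean, post_cov, size=100)
```

Conditioning shrinks the variance: the diagonal of post_cov is smaller than the prior variance of y_cv, which is what makes the posterior informative.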

inverse_transform(Y_pred: array, dtype: str = 'float64', get_PC=False) -> array[source]

Back-transforms the posterior samples Y_pred to their physical space.

Parameters:
  • Y_pred – The posterior samples (shape = (n_obs, n_components, n_samples))

  • get_PC – If True, returns the canonical variates

  • dtype – The dtype of the output array

Returns:

The back-transformed samples
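Back-transforming a stack of posterior samples usually amounts to flattening the sample axes, applying the inverse of the dimension reduction, and restoring the original layout. The sketch below does this with a scikit-learn PCA standing in for the target pipeline. The (n_obs, n_posts, n_components) layout, the random stand-in samples and the float32 cast are assumptions for illustration, not the behaviour of BEL.inverse_transform.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Stand-in target training set: 100 samples in a 40-dimensional physical space.
Y_train = rng.normal(size=(100, 40))
pca = PCA(n_components=6).fit(Y_train)

# Stand-in posterior samples in the reduced (PC) space.
n_obs, n_posts, n_comp = 2, 50, 6
samples_pc = rng.normal(size=(n_obs, n_posts, n_comp))

# Flatten (n_obs, n_posts) into one axis, back-transform, then restore the layout.
flat = samples_pc.reshape(-1, n_comp)
physical = pca.inverse_transform(flat).astype("float32")
physical = physical.reshape(n_obs, n_posts, -1)
```

Casting to a smaller dtype, as the dtype parameter allows, can reduce the memory footprint when the physical space is large.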

kde_init(X_obs_f: array, obs_n: int = None)[source]

Initialize the KDEs, i.e. the functions that will be used to sample from the posterior distribution.

Parameters:
  • X_obs_f – Observed data points

  • obs_n – Observation number

Returns:

The initialized KDEs
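The idea of a KDE as "a function used to sample from the posterior" can be illustrated with a one-dimensional product-kernel estimate of p(y | x_obs). This is a minimal NumPy sketch, not the skbel KDE: make_conditional_sampler, the fixed bandwidths bw_x and bw_y, and the synthetic data are assumptions.

```python
import numpy as np


def make_conditional_sampler(x_train, y_train, x_obs, bw_x, bw_y):
    """Return a function that draws samples from a Gaussian-kernel KDE of p(y | x_obs)."""
    # Each training point is weighted by a Gaussian kernel in x centred on the observation.
    weights = np.exp(-0.5 * ((x_obs - x_train) / bw_x) ** 2)
    weights /= weights.sum()

    def sample(n_posts, rng):
        # Pick training points according to their weights, then jitter with the y-kernel.
        idx = rng.choice(len(x_train), size=n_posts, p=weights)
        return y_train[idx] + bw_y * rng.normal(size=n_posts)

    return sample


rng = np.random.default_rng(0)
x_train = rng.normal(size=500)
y_train = 0.9 * x_train + 0.3 * rng.normal(size=500)

# Initialize once for a given observation, then sample as often as needed.
sampler = make_conditional_sampler(x_train, y_train, x_obs=1.0, bw_x=0.3, bw_y=0.1)
posterior = sampler(1000, rng)
```

Separating initialization from sampling mirrors the reason a precomputed KDE can be reused: the expensive part is done once per observation.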

property n_posts

Number of samples to extract from the posterior multivariate distribution after post-processing.

predict(X_obs: array = None, n_posts: int = None, mode: str = None, noise: float = None, return_samples: bool = True, inverse_transform: bool = True, precomputed_kde: array = None, dtype: str = 'float64') -> array[source]

Predict the posterior distribution of the target variable.

Parameters:
  • X_obs – The observed data.

  • n_posts – The number of posterior samples to draw.

  • mode – The mode of inference to use. Default is “tm”.

  • noise – The noise level of the model (only if mode == “mvn”).

  • return_samples – Option to return samples or not. Default=True.

  • inverse_transform – Option to return the samples in the original space. If the dimensionality of the original space is very high, this can be memory-consuming. It can be set to False to return the samples in the reduced space, which is much faster, so that the samples can be back-transformed later. Default=True.

  • precomputed_kde – (only if mode == “kde”) Precomputed KDE functions. Computing the KDEs can be time-consuming, so KDEs that have already been computed can be passed here to avoid recomputing them.

  • dtype – The data type of the samples. Default=float64.

Returns:

The posterior samples in the original space or in the transformed space.

random_sample(X_obs_f: None, obs_n: int = None, n_posts: int = None, mode: str = None, init_kde: array = None) -> array[source]

Randomly sample the inferred posterior distribution, i.e. generate samples from the posterior.

Parameters:
  • X_obs_f – Observed data points in the feature space. Shape = (n_obs, n_comp_CCA)

  • obs_n – If we want to generate samples from the posterior of a specific observation point, obs_n is the index of the observation point.

  • n_posts – Number of posterior samples

  • mode – How to sample the posterior distribution

  • init_kde – Initial KDE function. If None, the KDE function is computed from the observed data.

Returns:

Samples from the posterior distribution (n_obs, n_posts, n_comp_CCA)

property random_state

Seed, a.k.a. random state, used to reproduce the same samples.

property seed

Seed, a.k.a. random state, used to reproduce the same samples.

set_inverse_transform_request(*, Y_pred: bool | None | str = '$UNCHANGED$', dtype: bool | None | str = '$UNCHANGED$', get_PC: bool | None | str = '$UNCHANGED$') -> BEL

Configure whether metadata should be requested to be passed to the inverse_transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to inverse_transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to inverse_transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • Y_pred (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for Y_pred parameter in inverse_transform.

  • dtype (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for dtype parameter in inverse_transform.

  • get_PC (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for get_PC parameter in inverse_transform.

Returns:

self (object) – The updated object.

set_predict_request(*, X_obs: bool | None | str = '$UNCHANGED$', dtype: bool | None | str = '$UNCHANGED$', inverse_transform: bool | None | str = '$UNCHANGED$', mode: bool | None | str = '$UNCHANGED$', n_posts: bool | None | str = '$UNCHANGED$', noise: bool | None | str = '$UNCHANGED$', precomputed_kde: bool | None | str = '$UNCHANGED$', return_samples: bool | None | str = '$UNCHANGED$') -> BEL

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • X_obs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_obs parameter in predict.

  • dtype (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for dtype parameter in predict.

  • inverse_transform (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for inverse_transform parameter in predict.

  • mode (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for mode parameter in predict.

  • n_posts (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for n_posts parameter in predict.

  • noise (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for noise parameter in predict.

  • precomputed_kde (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for precomputed_kde parameter in predict.

  • return_samples (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for return_samples parameter in predict.

Returns:

self (object) – The updated object.

transform(X=None, Y=None) -> (array, array)[source]

Transform data across all pipelines.

Parameters:
  • X – Predictor array.

  • Y – Target array.

Returns:

Post-processed variables

property x_observation

Pre-processed observation.

property x_pre_processed

Pre-processed predictor.

property y_pre_processed

Pre-processed target.