pysensors.classification package
Module contents
- class pysensors.classification.SSPOC(basis=None, classifier=None, n_sensors=None, threshold=None, l1_penalty=0.1)
Bases: BaseEstimator
Sparse Sensor Placement Optimization for Classification (SSPOC) object.
As the name suggests, this class can be used to select optimal sensor locations (measurement locations) for classification tasks.
The time complexity of the SSPOC algorithm can be decomposed as
\[C_{total} = C_{basis} + C_{classification} + C_{optimization}\]
\(C_{basis}\): the complexity of fitting the selected basis object and producing the matrix inverse. The matrix inverse is “free” to compute for pysensors.basis.Identity and pysensors.basis.SVD. For pysensors.basis.RandomProjection the complexity is that of calling numpy.linalg.pinv on a matrix of size n_input_features * n_basis_modes.
\(C_{classification}\): the cost of fitting the chosen classifier to n_examples examples with n_basis_modes features.
\(C_{optimization}\): the cost of solving the sensor optimization problem. For binary classification we use sklearn.linear_model.OrthogonalMatchingPursuit. For multi-class classification we use sklearn.linear_model.MultiTaskLasso. The costs for each depend on the fit options that are specified. In both cases there are n_basis_modes examples with n_features features.
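As a rough illustration of the \(C_{basis}\) term for a random projection, the sketch below performs the one operation the text names: numpy.linalg.pinv applied to a matrix of size n_input_features * n_basis_modes. The Gaussian matrix and the sizes are illustrative assumptions and do not reproduce how pysensors builds its own basis.

```python
import numpy as np

# Illustrative sizes (assumptions, not library defaults).
n_input_features, n_basis_modes = 200, 10

rng = np.random.default_rng(0)
# Stand-in for a random projection basis matrix; pysensors may construct its own differently.
basis_matrix = rng.standard_normal((n_input_features, n_basis_modes))

# The pseudoinverse has the transposed shape: (n_basis_modes, n_input_features).
basis_matrix_inverse = np.linalg.pinv(basis_matrix)
print(basis_matrix_inverse.shape)

# For a matrix with full column rank, the pseudoinverse acts as a left inverse.
left_identity = basis_matrix_inverse @ basis_matrix
print(np.allclose(left_identity, np.eye(n_basis_modes)))
```

The resulting shape matches the (n_basis_modes, n_input_features) shape documented for basis_matrix_inverse_.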
The space complexity likewise depends on the same three factors. Generally, the basis requires O(n_basis_modes * n_features) space. The space requirements for classification and optimization depend on the particular algorithms being employed. See the Scikit-learn documentation for specifics.
See the following reference for more information:
Brunton, Bingni W., et al. “Sparse sensor placement optimization for classification.” SIAM Journal on Applied Mathematics 76.5 (2016): 2099-2122.
- Parameters:
basis (basis object, optional (default pysensors.basis.Identity)) – Basis in which to represent the data. Default is the identity basis (i.e. raw features).
classifier (classifier object, optional (default Linear Discriminant Analysis (LDA))) – Classifier for which to optimize sensors. Must be a linear classifier with a coef_ attribute and fit and predict methods.
n_sensors (nonnegative integer, optional (default None)) – Number of sensor locations to be used after fitting. If n_sensors is not None then it overrides the threshold parameter. If set to 0, then classifier will be replaced with a dummy classifier which predicts the class randomly.
threshold (nonnegative float, optional (default None)) – Threshold for selecting sensors. Overridden by n_sensors. If both threshold and n_sensors are None when the fit method is called, then the threshold will be set to
\[\frac{\|s\|_F}{2rc}\]
where \(s\) is a sensor coefficient matrix, \(r\) is the number of basis modes, and \(c\) is the number of distinct classes, as suggested in Brunton et al. (2016).
l1_penalty (nonnegative float, optional (default 0.1)) – The L1 penalty term used to form the sensor coefficient matrix, s. Larger values will result in a sparser s and fewer selected sensors. This parameter is ignored for binary classification problems.
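The default threshold formula above can be evaluated directly. The following is a minimal sketch, assuming s is available as a NumPy array; the helper name default_threshold and the toy matrix are hypothetical and not part of pysensors.

```python
import numpy as np

def default_threshold(s, n_basis_modes, n_classes):
    """Evaluate ||s||_F / (2 * r * c) (hypothetical helper, not a pysensors function)."""
    # Promote a 1-D coefficient vector (binary case) to 2-D so the Frobenius norm applies.
    s = np.atleast_2d(np.asarray(s, dtype=float))
    return np.linalg.norm(s, "fro") / (2 * n_basis_modes * n_classes)

# Toy coefficient matrix with shape (n_input_features, n_classes) = (3, 2).
# Its Frobenius norm is sqrt(3**2 + 4**2) = 5.
s = np.array([[3.0, 0.0],
              [0.0, 4.0],
              [0.0, 0.0]])

threshold = default_threshold(s, n_basis_modes=5, n_classes=2)
print(threshold)  # 5 / (2 * 5 * 2) = 0.25
```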
- Attributes:
n_basis_modes (nonnegative integer) – Number of basis modes to be used when deciding sensor locations.
basis_matrix_inverse_ (np.ndarray, shape (n_basis_modes, n_input_features)) – The inverse of the matrix of basis vectors.
sensor_coef_ (np.ndarray, shape (n_input_features, n_classes)) – The sensor coefficient matrix, s.
sparse_sensors_ (np.ndarray, shape (n_sensors, )) – The selected sensors.
Examples
>>> from sklearn.metrics import accuracy_score
>>> from sklearn.datasets import make_classification
>>> from pysensors.classification import SSPOC
>>>
>>> x, y = make_classification(n_classes=3, n_informative=3, random_state=10)
>>>
>>> model = SSPOC(n_sensors=10, l1_penalty=0.03)
>>> model.fit(x, y, quiet=True)
SSPOC(basis=Identity(n_basis_modes=100), classifier=LinearDiscriminantAnalysis(), l1_penalty=0.03, n_sensors=10)
>>> print(model.selected_sensors)
[10 13 6 19 17 16 15 14 12 11]
>>>
>>> acc = accuracy_score(y, model.predict(x[:, model.selected_sensors]))
>>> print("Accuracy:", acc)
Accuracy: 0.66
>>>
>>> model.update_sensors(n_sensors=5, xy=(x, y), quiet=True)
>>> print(model.selected_sensors)
[10 13 6 19 17]
>>>
>>> acc = accuracy_score(y, model.predict(x[:, model.selected_sensors]))
>>> print("Accuracy:", acc)
Accuracy: 0.6
- fit(x, y, quiet=False, prefit_basis=False, refit=True, **optimizer_kws)
Fit the SSPOC model, determining which sensors are relevant.
- Parameters:
x (array-like, shape (n_samples, n_input_features)) – Training data.
y (array-like, shape (n_samples,)) – Training labels.
quiet (boolean, optional (default False)) – Whether or not to suppress warnings during fitting.
prefit_basis (boolean, optional (default False)) – Whether or not the basis has already been fit to x. For example, you may have already fit and experimented with an SVD object to determine the optimal number of modes. This option allows you to avoid an unnecessary SVD.
refit (boolean, optional (default True)) – Whether or not to refit the classifier using measurements only from the learned sensor locations.
optimizer_kws (dict, optional) – Keyword arguments to be passed to the optimization routine.
- Returns:
self
- Return type:
a fitted SSPOC instance
- predict(x)
Predict classes for given measurements. If self.n_sensors is 0 then a dummy classifier is used in place of self.classifier.
- Parameters:
x (array-like, shape (n_samples, n_sensors) or (n_samples, n_features)) – Examples to be classified. The measurements should be taken at the sensor locations specified by self.selected_sensors.
- Returns:
y – Predicted classes.
- Return type:
np.ndarray, shape (n_samples,)
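When only full measurements are at hand, the columns at the sensor locations can be extracted with NumPy integer indexing, as the class-level example does with x[:, model.selected_sensors]. The sketch below uses toy data and a hypothetical index array standing in for self.selected_sensors; it does not call pysensors.

```python
import numpy as np

# Full measurements: 4 samples at 6 candidate sensor locations (toy data).
x_full = np.arange(24).reshape(4, 6)

# Hypothetical sensor indices, standing in for model.selected_sensors.
selected_sensors = np.array([4, 1, 3])

# Keep only the columns at the selected sensors, in the order listed.
x_sensors = x_full[:, selected_sensors]
print(x_sensors.shape)  # (4, 3)
print(x_sensors[0])     # [4 1 3]
```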
- update_sensors(n_sensors=None, threshold=None, xy=None, quiet=False, method=np.max, **method_kws)
Update the selected sensors by changing either the preferred number of sensors or the threshold used to select the sensors, refitting the classifier afterwards, if possible.
- Parameters:
n_sensors (nonnegative integer, optional (default None)) – The number of sensor locations to select. If None, then threshold will be used to pick the sensors. Note that n_sensors and threshold cannot both be None.
threshold (nonnegative float, optional (default None)) – The threshold to use to select sensors based on the magnitudes of entries in self.sensor_coef_ (s). Overridden by n_sensors. Note that n_sensors and threshold cannot both be None.
xy (tuple of np.ndarray, length 2, optional (default None)) – Tuple containing training data x and labels y for refitting. x should have shape (n_samples, n_input_features) and y shape (n_samples, ). If not None, the classifier will be refit after the new sensors have been selected.
quiet (boolean, optional (default False)) – Whether to silence warnings.
method (callable, optional (default np.max)) – Function used along with threshold to select sensors. For binary classification problems one need not specify a method. For multiclass classification problems, sensor_coef_ (s) has multiple columns and method is applied along each row to aggregate coefficients for thresholding, i.e. method is called as follows: method(np.abs(self.sensor_coef_), axis=1, **method_kws). Other examples of acceptable methods are np.min, np.mean, and np.median.
**method_kws (dict, optional) – Keyword arguments to be passed into method when it is called.
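The row-wise aggregation described for method can be sketched with plain NumPy. The helper select_sensors below is hypothetical, and two details are assumptions rather than documented behavior: a sensor is kept when its aggregated magnitude is greater than or equal to threshold, and n_sensors picks the largest aggregated magnitudes first.

```python
import numpy as np

def select_sensors(sensor_coef, n_sensors=None, threshold=None,
                   method=np.max, **method_kws):
    """Hypothetical helper mirroring the selection rule described above."""
    if n_sensors is None and threshold is None:
        raise ValueError("n_sensors and threshold cannot both be None")
    s = np.abs(np.asarray(sensor_coef, dtype=float))
    # Multiclass: aggregate each row across classes. Binary: s is already 1-D.
    scores = method(s, axis=1, **method_kws) if s.ndim == 2 else s
    if n_sensors is not None:
        # n_sensors overrides threshold (assumed ordering: largest scores first).
        return np.argsort(-scores, kind="stable")[:n_sensors]
    # Assumption: keep sensors whose score meets or exceeds the threshold.
    return np.nonzero(scores >= threshold)[0]

# Toy coefficient matrix: 4 candidate sensors (rows), 3 classes (columns).
s = np.array([[0.9, -0.1, 0.0],
              [0.0, 0.0, 0.05],
              [-0.4, 0.6, 0.2],
              [0.0, 0.0, 0.0]])

by_max = select_sensors(s, threshold=0.5)                   # row maxima: 0.9, 0.05, 0.6, 0.0
by_mean = select_sensors(s, threshold=0.35, method=np.mean)  # row means: ~0.33, ~0.02, 0.4, 0.0
top_two = select_sensors(s, n_sensors=2)
print(by_max, by_mean, top_two)  # [0 2] [2] [0 2]
```

With the same threshold scale, np.mean is stricter than np.max on rows where a single class dominates, which is why sensor 0 drops out in the second call.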
- update_n_basis_modes(n_basis_modes, xy, **fit_kws)
Re-fit the SSPOC object using a different value of n_basis_modes.
This method allows one to relearn sensor locations for a different number of basis modes without re-fitting the basis in many cases. Specifically, if n_basis_modes <= self.basis.n_basis_modes then the basis does not need to be refit. Otherwise this function does not save any computational resources.
- Parameters:
n_basis_modes (positive int) – Number of basis modes to be used during fit. Must be less than or equal to n_samples.
xy (tuple of np.ndarray, length 2) – Tuple containing training data x and labels y for refitting. x should have shape (n_samples, n_input_features) and y shape (n_samples, ).
**fit_kws (dict, optional) – Keyword arguments to pass to SSPOC.fit.
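The reason a smaller n_basis_modes needs no refit can be seen with an SVD basis: the leading modes of a larger decomposition are exactly the modes of a smaller one. The sketch below uses numpy.linalg.svd as a stand-in for pysensors.basis.SVD, whose centering and solver details may differ; truncated_basis is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((30, 8))  # toy data: 30 samples, 8 features

# Fit once with the largest number of modes expected to be needed.
max_modes = 6
_, _, vt = np.linalg.svd(x, full_matrices=False)
basis_matrix = vt[:max_modes].T  # shape (n_features, max_modes)

def truncated_basis(basis_matrix, n_basis_modes):
    """Reuse an already-fit basis when fewer modes are requested (hypothetical helper)."""
    if n_basis_modes > basis_matrix.shape[1]:
        # More modes than were fit: nothing can be reused, so a refit is needed.
        raise ValueError("n_basis_modes exceeds the number of fitted modes")
    return basis_matrix[:, :n_basis_modes]

small_basis = truncated_basis(basis_matrix, 3)
print(small_basis.shape)  # (8, 3)

# The retained modes are still orthonormal, so no new decomposition is required.
print(np.allclose(small_basis.T @ small_basis, np.eye(3)))
```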
- property selected_sensors
Get the indices of the selected sensors.
- Returns:
sensors – Indices of the selected sensors.
- Return type:
numpy array, shape (n_sensors,)
- get_selected_sensors()
Convenience function for getting indices of the selected sensors.
- Returns:
sensors – Indices of the selected sensors.
- Return type:
numpy array, shape (n_sensors,)
- set_fit_request(*, prefit_basis: bool | None | str = '$UNCHANGED$', quiet: bool | None | str = '$UNCHANGED$', refit: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') -> SSPOC
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
prefit_basis (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for prefit_basis parameter in fit.
quiet (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quiet parameter in fit.
refit (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for refit parameter in fit.
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_predict_request(*, x: bool | None | str = '$UNCHANGED$') -> SSPOC
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.
- Returns:
self – The updated object.
- Return type:
object