pysensors package
Subpackages
Module contents
- class pysensors.SSPOR(basis=None, optimizer=None, n_sensors=None)
Bases: BaseEstimator
Sparse Sensor Placement Optimization for Reconstruction: a model for selecting the best sensor locations for state reconstruction.
Given a basis in which to represent the state (e.g. PCA modes) along with measurement data, an SSPOR instance produces a list of sensor locations (a permutation of the numbers 0, 1, …, n_input_features - 1) ranked in descending order of importance. One can then select the top k sensors and take future measurements at that limited set of locations.
The overall time complexity of fitting an SSPOR object is O(n_basis_modes * n_input_features * n_input_features) plus the cost of fitting the basis. Different bases have different complexities. The space complexity is O(n_basis_modes * n_input_features).
- Parameters
basis (basis object, optional (default pysensors.basis.Identity)) – Basis in which to represent the data. Default is the identity basis (i.e. raw features).
optimizer (optimizer object, optional (default pysensors.optimizers.QR)) – Optimization method used to rank sensor locations.
n_sensors (int, optional (default n_input_features)) – Number of sensors to select. Note that s = SSPOR(n_sensors=10); s.fit(x) is equivalent to s = SSPOR(); s.fit(x); s.set_number_of_sensors(10).
- Attributes
n_basis_modes (int) – Number of basis modes considered during fitting.
basis_matrix_ (np.ndarray) – Internal representation of the basis.
ranked_sensors_ (np.ndarray) – Sensor locations ranked in descending order of importance.
Examples
>>> import numpy as np
>>> from pysensors import SSPOR
>>>
>>> x = np.linspace(0, 1, 501)
>>> monomials = np.vander(x, 15).T
>>>
>>> model = SSPOR(n_sensors=5)
>>> model.fit(monomials)
SSPOR(basis=Identity(n_basis_modes=15), n_sensors=5, optimizer=QR())
>>> print(model.selected_sensors)
[500 377 0 460 185]
>>> print(x[model.selected_sensors])
[1. 0.754 0. 0.92 0.37 ]
>>> model.set_n_sensors(7)
>>> print(x[model.selected_sensors])
[1. 0.754 0. 0.92 0.37 0.572 0.134]
>>> f = np.sin(3*x)
>>> f_pred = model.predict(f[model.selected_sensors])
>>> print(np.linalg.norm(f - f_pred))
0.022405698005838044
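The ranking idea can be illustrated without pysensors. The sketch below is a minimal stand-in, not the library's implementation: it assumes a QR-style optimizer ranks sensors by the pivot order of a column-pivoted QR factorization (here done with Householder reflections) of the transposed basis matrix. The function name rank_sensors_qr and the test matrix are hypothetical.

```python
import numpy as np


def rank_sensors_qr(basis_matrix):
    """Rank sensors by greedy column-pivoted QR (illustrative sketch).

    basis_matrix has shape (n_input_features, n_basis_modes). Only the first
    min(n_basis_modes, n_input_features) entries of the result are truly ranked.
    """
    work = np.array(basis_matrix, dtype=float).T  # (n_modes, n_features)
    n_modes, n_features = work.shape
    order = np.arange(n_features)
    for k in range(min(n_modes, n_features)):
        # Pivot: pick the remaining column (sensor) with the largest residual norm.
        col_norms = np.sum(work[k:, k:] ** 2, axis=0)
        p = k + int(np.argmax(col_norms))
        work[:, [k, p]] = work[:, [p, k]]
        order[[k, p]] = order[[p, k]]
        # Householder reflection that zeroes column k below the diagonal.
        v = work[k:, k].copy()
        alpha = np.linalg.norm(v)
        if alpha == 0.0:
            break  # nothing left to rank
        v[0] += alpha if v[0] >= 0 else -alpha
        v /= np.linalg.norm(v)
        work[k:, k:] -= 2.0 * np.outer(v, v @ work[k:, k:])
    return order


rng = np.random.default_rng(0)
basis_matrix = rng.standard_normal((20, 4))  # 20 candidate sensors, 4 modes
basis_matrix[7] = 10.0                        # make sensor 7 clearly dominant
ranking = rank_sensors_qr(basis_matrix)
print(ranking[:4])                            # the four ranked sensors
```

Sensors beyond the first n_basis_modes positions are left in whatever order the pivoting swaps produced, which is why only the leading entries carry a meaningful rank.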
- fit(x, quiet=False, prefit_basis=False, seed=None, **optimizer_kws)
Fit the SSPOR model, determining which sensors are relevant.
- Parameters
x (array-like, shape (n_samples, n_input_features)) – Training data.
quiet (boolean, optional (default False)) – Whether or not to suppress warnings during fitting.
prefit_basis (boolean, optional (default False)) – Whether or not the basis has already been fit to x. For example, you may have already fit and experimented with an SVD object to determine the optimal number of modes. This option allows you to avoid an unnecessary SVD.
seed (int, optional (default None)) – Seed for the random number generator used to shuffle the sensors ranked after the first self.basis.n_basis_modes. Most optimizers only rank the top self.basis.n_basis_modes sensors, leaving the rest virtually untouched. As a result the remaining sensors are randomly permuted.
optimizer_kws (dict, optional) – Keyword arguments to be passed to the get_sensors method of the optimizer.
- Returns
self
- Return type
a fitted SSPOR instance
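The role of the seed parameter can be sketched as follows. This is an assumption about the mechanism, not the library's code: the first n_basis_modes entries of a ranking are kept, and the unranked tail is permuted with a seeded NumPy generator. The helper name shuffle_unranked_tail is hypothetical.

```python
import numpy as np


def shuffle_unranked_tail(ranked_sensors, n_basis_modes, seed=None):
    """Keep the ranked head and randomly permute the remaining sensors."""
    ranked = np.array(ranked_sensors, copy=True)
    rng = np.random.default_rng(seed)
    # Only the tail is permuted; the first n_basis_modes entries stay in place.
    ranked[n_basis_modes:] = rng.permutation(ranked[n_basis_modes:])
    return ranked


ranking = np.arange(10)
shuffled_a = shuffle_unranked_tail(ranking, n_basis_modes=3, seed=42)
shuffled_b = shuffle_unranked_tail(ranking, n_basis_modes=3, seed=42)
print(shuffled_a)
```

Passing the same seed twice gives the same tail order, which is what makes a fit reproducible.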
- predict(x, **solve_kws)
Predict values at all positions given measurements at sensor locations.
- Parameters
x (array-like, shape (n_samples, n_sensors)) – Measurements from which to form prediction. The measurements should be taken at the sensor locations specified by self.get_selected_sensors().
solve_kws (dict, optional) – Keyword arguments to be passed to the linear solver used to invert the basis matrix.
- Returns
y – Predicted values at every location.
- Return type
numpy array, shape (n_samples, n_features)
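As a rough picture of what reconstruction from sparse measurements involves, the sketch below solves for basis coefficients using only the rows of the basis matrix at the sensor locations, then expands those coefficients to every location. It assumes a basis matrix of shape (n_features, n_basis_modes) and uses a least-squares solve; the function reconstruct is hypothetical and the library's solver may differ.

```python
import numpy as np


def reconstruct(basis_matrix, sensors, measurements):
    """Least-squares reconstruction of full states from sensor measurements.

    basis_matrix: (n_features, n_basis_modes)
    sensors:      indices of the measured locations
    measurements: (n_samples, n_sensors)
    """
    rows = basis_matrix[sensors, :]                       # (n_sensors, n_modes)
    coeffs, *_ = np.linalg.lstsq(rows, measurements.T, rcond=None)
    return (basis_matrix @ coeffs).T                      # (n_samples, n_features)


rng = np.random.default_rng(0)
basis_matrix = rng.standard_normal((50, 3))
states = rng.standard_normal((5, 3)) @ basis_matrix.T     # states in the basis span
sensors = np.array([0, 10, 20, 30])
states_pred = reconstruct(basis_matrix, sensors, states[:, sensors])
print(np.linalg.norm(states - states_pred))
```

Because the synthetic states lie exactly in the span of the basis and there are more sensors than modes, the reconstruction error here is at the level of floating-point round-off.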
- get_selected_sensors()
Get the indices of the sensors chosen by the model.
- Returns
sensors – Indices of the sensors chosen by the model (i.e. the sensor locations) ranked in descending order of importance.
- Return type
numpy array, shape (n_sensors,)
- property selected_sensors
Get the indices of the sensors chosen by the model.
- Returns
sensors – Indices of the sensors chosen by the model (i.e. the sensor locations) ranked in descending order of importance.
- Return type
numpy array, shape (n_sensors,)
- get_all_sensors()
Get a ranked list consisting of all the sensors. The sensors are given in descending order of importance.
- Returns
sensors – Indices of sensors in descending order of importance.
- Return type
numpy array, shape (n_features,)
- property all_sensors
Get a ranked list consisting of all the sensors. The sensors are given in descending order of importance.
- Returns
sensors – Indices of sensors in descending order of importance.
- Return type
numpy array, shape (n_features,)
- set_number_of_sensors(n_sensors)
Set n_sensors, the number of sensors to be used for prediction.
- Parameters
n_sensors (int) – The number of sensors. Must be a positive integer. Cannot exceed the number of available sensors (n_features).
- set_n_sensors(n_sensors)
A convenience function accomplishing the same thing as set_number_of_sensors. Set n_sensors, the number of sensors to be used for prediction.
- Parameters
n_sensors (int) – The number of sensors. Must be a positive integer. Cannot exceed the number of available sensors (n_features).
- update_n_basis_modes(n_basis_modes, x=None, quiet=False)
Re-fit the SSPOR object using a different value of n_basis_modes.
This method allows one to relearn sensor locations for a different number of basis modes _without_ re-fitting the basis in many cases. Specifically, if n_basis_modes <= self.basis.n_basis_modes then the basis does not need to be refit. Otherwise this function does not save any computational resources.
- Parameters
n_basis_modes (positive int) – Number of basis modes to be used during fit. Must be less than or equal to n_samples.
x (numpy array, shape (n_examples, n_features), optional (default None)) – Only used if n_basis_modes exceeds the number of available basis modes for the already fit basis.
quiet (boolean, optional (default False)) – Whether or not to suppress warnings during refitting.
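The reason a smaller n_basis_modes needs no refit can be seen with a plain SVD: the leading modes of a basis fit with more modes are just a slice of the stored matrix. The sketch below is illustrative only; modes_for is a hypothetical helper and does not mirror the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((30, 12))            # (n_examples, n_features)

# Fit an SVD basis once, keeping 8 modes (columns are basis vectors).
_, _, vt = np.linalg.svd(x, full_matrices=False)
fitted_modes = vt[:8].T                      # shape (n_features, 8)


def modes_for(n_basis_modes, fitted_modes):
    """Return the leading modes, or signal that the basis must be refit."""
    if n_basis_modes <= fitted_modes.shape[1]:
        return fitted_modes[:, :n_basis_modes]   # cheap: no new decomposition
    raise ValueError("more modes requested than were fit; refit the basis on x")


modes_5 = modes_for(5, fitted_modes)
print(modes_5.shape)
```

Requesting more modes than were originally fit raises here, which corresponds to the case where x must be supplied so the basis can be fit again.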
- score(x, y=None, score_function=None, score_kws={}, solve_kws={})
Compute the reconstruction error for a given set of measurements.
- Parameters
x (numpy array, shape (n_examples, n_features)) – Measurements with which to compute the score. Note that x should consist of measurements at every location, not just the recommended sensor locations, i.e. its shape should be (n_examples, n_features) rather than (n_examples, n_sensors).
y (None) – Dummy input to maintain compatibility with Scikit-learn.
score_function (callable, optional (default None)) – Function used to compute the score. Should have the call signature score_function(y_true, y_pred, **score_kws). Default is the negative of the root mean squared error (sklearn expects higher scores to correspond to better performance).
score_kws (dict, optional) – Keyword arguments to be passed to score_function. Ignored if score_function is None.
solve_kws (dict, optional) – Keyword arguments to be passed to the predict method.
- Returns
score – The score.
- Return type
float
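A custom score_function only has to follow the signature score_function(y_true, y_pred, **score_kws). As an illustration, the sketch below writes out a negative root mean squared error by hand, matching the description of the default; the function name negative_rmse is hypothetical.

```python
import numpy as np


def negative_rmse(y_true, y_pred):
    """Negative root mean squared error: higher (closer to 0) is better."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return -np.sqrt(np.mean((y_true - y_pred) ** 2))


truth = np.array([[0.0, 0.0]])
good = negative_rmse(truth, np.array([[0.1, 0.1]]))
bad = negative_rmse(truth, np.array([[3.0, 4.0]]))
print(good, bad)
```

The sign flip means a better reconstruction receives a larger score, which is the convention Scikit-learn model selection tools expect.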
- reconstruction_error(x_test, sensor_range=None, score=None, **solve_kws)
Compute the reconstruction error for different numbers of sensors.
- Parameters
x_test (numpy array, shape (n_examples, n_features)) – Measurements to be reconstructed.
sensor_range (1D numpy array, optional (default None)) – Numbers of sensors at which to compute the reconstruction error. If None, will be set to [1, 2, …, min(n_sensors, basis.n_basis_modes)].
score (callable, optional (default None)) – Function used to compute the reconstruction error. Should have the signature score(x, x_pred). If None, the root mean squared error is used.
solve_kws (dict, optional) – Keyword arguments to be passed to the linear solver.
- Returns
error – Reconstruction scores for each number of sensors in sensor_range.
- Return type
numpy array, shape (len(sensor_range),)
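The shape of the computation can be sketched as a loop over sensor counts, reconstructing the test data from the top k ranked sensors each time and recording the root mean squared error. This is a simplified assumption of the procedure using a least-squares solve; reconstruction_errors and the placeholder ranking are hypothetical.

```python
import numpy as np


def reconstruction_errors(basis_matrix, ranked_sensors, x_test, sensor_range):
    """RMSE of least-squares reconstructions for each sensor count."""
    errors = np.zeros(len(sensor_range))
    for i, k in enumerate(sensor_range):
        sensors = ranked_sensors[:k]
        # Fit mode coefficients from the k measured entries of every example.
        coeffs, *_ = np.linalg.lstsq(
            basis_matrix[sensors], x_test[:, sensors].T, rcond=None
        )
        x_pred = (basis_matrix @ coeffs).T
        errors[i] = np.sqrt(np.mean((x_test - x_pred) ** 2))
    return errors


rng = np.random.default_rng(1)
basis_matrix = np.linalg.qr(rng.standard_normal((40, 3)))[0]  # 3 orthonormal modes
x_test = rng.standard_normal((10, 3)) @ basis_matrix.T        # states in the span
ranked_sensors = np.arange(40)                                # placeholder ranking
sensor_range = np.arange(1, 7)
errors = reconstruction_errors(basis_matrix, ranked_sensors, x_test, sensor_range)
print(errors)
```

With noise-free data in the span of three modes, the error drops to round-off once enough sensors are available to determine all three coefficients.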
- class pysensors.SSPOC(basis=None, classifier=None, n_sensors=None, threshold=None, l1_penalty=0.1)
Bases: BaseEstimator
Sparse Sensor Placement Optimization for Classification (SSPOC) object.
As the name suggests, this class can be used to select optimal sensor locations (measurement locations) for classification tasks.
The time complexity of the SSPOC algorithm can be decomposed as
\[C_{total} = C_{basis} + C_{classification} + C_{optimization}\]
\(C_{basis}\): the complexity of fitting the selected basis object and producing the matrix inverse. The matrix inverse is “free” to compute for pysensors.basis.Identity and pysensors.basis.SVD. For pysensors.basis.RandomProjection the complexity is that of calling numpy.linalg.pinv on a matrix of size n_input_features * n_basis_modes.
\(C_{classification}\): the cost of fitting the chosen classifier to n_examples examples with n_basis_modes features.
\(C_{optimization}\): the cost of solving the sensor optimization problem. For binary classification we use sklearn.linear_model.OrthogonalMatchingPursuit. For multi-class classification we use sklearn.linear_model.MultiTaskLasso. The costs for each depend on the fit options that are specified. In both cases there are n_basis_modes examples with n_features features.
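To make the pseudoinverse cost for a random projection basis concrete, the sketch below calls numpy.linalg.pinv on a matrix of the stated size and checks the resulting shape. The Gaussian matrix is only an assumed stand-in for a fitted random projection basis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input_features, n_basis_modes = 40, 6

# Stand-in for a random projection basis: one column per basis mode.
projection = rng.standard_normal((n_input_features, n_basis_modes))

# The pseudoinverse has shape (n_basis_modes, n_input_features).
projection_inverse = np.linalg.pinv(projection)
print(projection_inverse.shape)
```

For an identity basis or an orthonormal SVD basis no such call is needed, because the inverse is the matrix itself or its transpose.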
The space complexity likewise depends on the same three factors. Generally, the basis requires O(n_basis_modes * n_features) space. The space requirements for classification and optimization depend on the particular algorithms being employed. See the Scikit-learn documentation for specifics.
See the following reference for more information:
Brunton, Bingni W., et al. “Sparse sensor placement optimization for classification.” SIAM Journal on Applied Mathematics 76.5 (2016): 2099-2122.
- Parameters
basis (basis object, optional (default pysensors.basis.Identity)) – Basis in which to represent the data. Default is the identity basis (i.e. raw features).
classifier (classifier object, optional (default Linear Discriminant Analysis (LDA))) – Classifier for which to optimize sensors. Must be a linear classifier with a coef_ attribute and fit and predict methods.
n_sensors (nonnegative integer, optional (default None)) – Number of sensor locations to be used after fitting. If n_sensors is not None then it overrides the threshold parameter. If set to 0, then classifier will be replaced with a dummy classifier which predicts the class randomly.
threshold (nonnegative float, optional (default None)) – Threshold for selecting sensors. Overridden by n_sensors. If both threshold and n_sensors are None when the fit method is called, then the threshold will be set to
\[\frac{\|s\|_F}{2rc}\]
where \(s\) is a sensor coefficient matrix, \(r\) is the number of basis modes, and \(c\) is the number of distinct classes, as suggested in Brunton et al. (2016).
l1_penalty (nonnegative float, optional (default 0.1)) – The L1 penalty term used to form the sensor coefficient matrix, s. Larger values will result in a sparser s and fewer selected sensors. This parameter is ignored for binary classification problems.
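The default threshold formula is simple to evaluate by hand. The sketch below computes \(\|s\|_F / (2rc)\) for a small coefficient matrix; default_threshold is a hypothetical helper written for illustration, not a function from the library.

```python
import numpy as np


def default_threshold(sensor_coef, n_basis_modes, n_classes):
    """Frobenius norm of s divided by 2 * r * c."""
    s = np.atleast_2d(np.asarray(sensor_coef, dtype=float))
    return np.linalg.norm(s, "fro") / (2 * n_basis_modes * n_classes)


s = np.ones((4, 3))          # 4 input features, 3 classes
threshold = default_threshold(s, n_basis_modes=2, n_classes=3)
print(threshold)             # sqrt(12) / 12
```

Here the Frobenius norm is sqrt(12) and the denominator is 2 * 2 * 3 = 12.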
- Attributes
n_basis_modes (nonnegative integer) – Number of basis modes to be used when deciding sensor locations.
basis_matrix_inverse_ (np.ndarray, shape (n_basis_modes, n_input_features)) – The inverse of the matrix of basis vectors.
sensor_coef_ (np.ndarray, shape (n_input_features, n_classes)) – The sensor coefficient matrix, s.
sparse_sensors_ (np.ndarray, shape (n_sensors, )) – The selected sensors.
Examples
>>> from sklearn.metrics import accuracy_score
>>> from sklearn.datasets import make_classification
>>> from pysensors.classification import SSPOC
>>>
>>> x, y = make_classification(n_classes=3, n_informative=3, random_state=10)
>>>
>>> model = SSPOC(n_sensors=10, l1_penalty=0.03)
>>> model.fit(x, y, quiet=True)
SSPOC(basis=Identity(n_basis_modes=100), classifier=LinearDiscriminantAnalysis(), l1_penalty=0.03, n_sensors=10)
>>> print(model.selected_sensors)
[10 13 6 19 17 16 15 14 12 11]
>>>
>>> acc = accuracy_score(y, model.predict(x[:, model.selected_sensors]))
>>> print("Accuracy:", acc)
Accuracy: 0.66
>>>
>>> model.update_sensors(n_sensors=5, xy=(x, y), quiet=True)
>>> print(model.selected_sensors)
[10 13 6 19 17]
>>>
>>> acc = accuracy_score(y, model.predict(x[:, model.selected_sensors]))
>>> print("Accuracy:", acc)
Accuracy: 0.6
- fit(x, y, quiet=False, prefit_basis=False, refit=True, **optimizer_kws)
Fit the SSPOC model, determining which sensors are relevant.
- Parameters
x (array-like, shape (n_samples, n_input_features)) – Training data.
y (array-like, shape (n_samples,)) – Training labels.
quiet (boolean, optional (default False)) – Whether or not to suppress warnings during fitting.
prefit_basis (boolean, optional (default False)) – Whether or not the basis has already been fit to x. For example, you may have already fit and experimented with an SVD object to determine the optimal number of modes. This option allows you to avoid an unnecessary SVD.
refit (boolean, optional (default True)) – Whether or not to refit the classifier using measurements only from the learned sensor locations.
optimizer_kws (dict, optional) – Keyword arguments to be passed to the optimization routine.
- Returns
self
- Return type
a fitted SSPOC instance
- predict(x)
Predict classes for given measurements. If self.n_sensors is 0 then a dummy classifier is used in place of self.classifier.
- Parameters
x (array-like, shape (n_samples, n_sensors) or (n_samples, n_features)) – Examples to be classified. The measurements should be taken at the sensor locations specified by self.selected_sensors.
- Returns
y – Predicted classes.
- Return type
np.ndarray, shape (n_samples,)
- update_sensors(n_sensors=None, threshold=None, xy=None, quiet=False, method=np.max, **method_kws)
Update the selected sensors by changing either the preferred number of sensors or the threshold used to select the sensors, refitting the classifier afterwards, if possible.
- Parameters
n_sensors (nonnegative integer, optional (default None)) – The number of sensor locations to select. If None, then threshold will be used to pick the sensors. Note that n_sensors and threshold cannot both be None.
threshold (nonnegative float, optional (default None)) – The threshold to use to select sensors based on the magnitudes of entries in self.sensor_coef_ (s). Overridden by n_sensors. Note that n_sensors and threshold cannot both be None.
xy (tuple of np.ndarray, length 2, optional (default None)) – Tuple containing training data x and labels y for refitting. x should have shape (n_samples, n_input_features) and y shape (n_samples, ). If not None, the classifier will be refit after the new sensors have been selected.
quiet (boolean, optional (default False)) – Whether to silence warnings.
method (callable, optional (default np.max)) – Function used along with threshold to select sensors. For binary classification problems one need not specify a method. For multiclass classification problems, sensor_coef_ (s) has multiple columns and method is applied along each row to aggregate coefficients for thresholding, i.e. method is called as follows: method(np.abs(self.sensor_coef_), axis=1, **method_kws). Other examples of acceptable methods are np.min, np.mean, and np.median.
**method_kws (dict, optional) – Keyword arguments to be passed into method when it is called.
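The interaction between method, threshold, and n_sensors can be sketched with NumPy alone. The helper below is hypothetical and only approximates the selection rule described above: it aggregates each row of the absolute coefficient matrix with method, then keeps either the top n_sensors rows or every row whose aggregate is at least threshold. The use of >= and the ordering by magnitude are assumptions.

```python
import numpy as np


def select_sensors(sensor_coef, n_sensors=None, threshold=None,
                   method=np.max, **method_kws):
    """Pick sensors from a coefficient matrix s (illustrative sketch)."""
    magnitudes = np.abs(np.asarray(sensor_coef, dtype=float))
    if magnitudes.ndim == 2:
        # Multiclass case: one aggregate value per sensor (row).
        magnitudes = method(magnitudes, axis=1, **method_kws)
    order = np.argsort(-magnitudes, kind="stable")   # largest magnitude first
    if n_sensors is not None:                        # n_sensors overrides threshold
        return order[:n_sensors]
    if threshold is None:
        raise ValueError("n_sensors and threshold cannot both be None")
    return order[magnitudes[order] >= threshold]


s = np.array([[0.1, -0.9],
              [0.5, 0.4],
              [0.0, 0.05]])
by_max = select_sensors(s, threshold=0.3)                  # row maxima: 0.9, 0.5, 0.05
by_min = select_sensors(s, threshold=0.3, method=np.min)   # row minima: 0.1, 0.4, 0.0
top_one = select_sensors(s, n_sensors=1, threshold=0.3)
print(by_max, by_min, top_one)
```

Switching from np.max to np.min makes the rule stricter, since a sensor is then kept only if its coefficient is large for every class.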
- update_n_basis_modes(n_basis_modes, xy, **fit_kws)
Re-fit the SSPOC object using a different value of n_basis_modes.
This method allows one to relearn sensor locations for a different number of basis modes _without_ re-fitting the basis in many cases. Specifically, if n_basis_modes <= self.basis.n_basis_modes then the basis does not need to be refit. Otherwise this function does not save any computational resources.
- Parameters
n_basis_modes (positive int) – Number of basis modes to be used during fit. Must be less than or equal to n_samples.
xy (tuple of np.ndarray, length 2) – Tuple containing training data x and labels y for refitting. x should have shape (n_samples, n_input_features) and y shape (n_samples, ).
**fit_kws (dict, optional) – Keyword arguments to pass to SSPOC.fit.
- property selected_sensors
Get the indices of the selected sensors.
- Returns
sensors – Indices of the selected sensors.
- Return type
numpy array, shape (n_sensors,)