Documentation of External and Wrapped Nodes

pySPACE comes along with wrappers to external algorithms.

For details on the usage of the nodes and for getting usage examples, have a look at their documentation. Module for external node wrapping: pySPACE.missions.nodes.external

Scikit Nodes

Nodes from scikits wrapper

`pySPACE.missions.nodes.regression.scikit_decorators.OptSVRRegressorSklearnNode`

class pySPACE.missions.nodes.regression.scikit_decorators.OptSVRRegressorSklearnNode(C=1, epsilon=0.1, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, tol=0.001, verbose=False, max_iter=-1, **kwargs)

Bases: pySPACE.missions.nodes.scikit_nodes.SVRRegressorSklearnNode

Decorator wrapper around SVRRegressorSklearnNode

Epsilon-Support Vector Regression.

This node has been automatically generated by wrapping the sklearn.svm.classes.SVR class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The free parameters in the model are C and epsilon.

The implementation is based on libsvm.

Read more in the User Guide.

Parameters

C: Penalty parameter C of the error term.
epsilon: Epsilon in the epsilon-SVR model. It specifies the epsilon-tube within which no penalty is associated in the training loss function with points predicted within a distance epsilon from the actual value.
kernel: Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. If a callable is given it is used to precompute the kernel matrix.
degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.
coef0: Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
shrinking: Whether to use the shrinking heuristic.
tol: Tolerance for stopping criterion.
cache_size: Specify the size of the kernel cache (in MB).
verbose: Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
max_iter: Hard limit on iterations within solver, or -1 for no limit.
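
As a rough illustration of two of the parameters above, the following sketch shows the epsilon-insensitive loss and an RBF kernel with gamma set to 'auto'. The function names `epsilon_insensitive_loss` and `rbf_kernel` are hypothetical helpers written for this note; they are not part of pySPACE, sklearn, or libsvm.

```python
import math


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Return zero inside the epsilon-tube and a linear penalty outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def rbf_kernel(x, z, gamma="auto"):
    """Compute exp(-gamma * ||x - z||^2); 'auto' is read as 1 / n_features."""
    if gamma == "auto":
        gamma = 1.0 / len(x)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)
```

A prediction that misses the target by less than epsilon contributes no loss, which is why a larger epsilon typically leads to fewer support vectors.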

Attributes

support_: array-like, shape = [n_SV]. Indices of support vectors.
support_vectors_: array-like, shape = [n_SV, n_features]. Support vectors.
dual_coef_: array, shape = [1, n_SV]. Coefficients of the support vector in the decision function.
coef_: array, shape = [1, n_features]. Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel. coef_ is a read-only property derived from dual_coef_ and support_vectors_.
intercept_: array, shape = [1]. Constants in decision function.

Examples

>>> from sklearn.svm import SVR
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> np.random.seed(0)
>>> y = np.random.randn(n_samples)
>>> X = np.random.randn(n_samples, n_features)
>>> clf = SVR(C=1.0, epsilon=0.2)
>>> clf.fit(X, y) 
SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.2, gamma='auto',
    kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

See also

NuSVR: Support Vector Machine for regression implemented using libsvm using a parameter to control the number of support vectors.
LinearSVR: Scalable Linear Support Vector Machine for regression implemented using liblinear.

POSSIBLE NODE NAMES:
	OptSVRRegressorSklearnNode OptSVRRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ARDRegressionSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.ARDRegressionSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

Bayesian ARD regression.

This node has been automatically generated by wrapping the sklearn.linear_model.bayes.ARDRegression class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Fit the weights of a regression model, using an ARD prior. The weights of the regression model are assumed to follow Gaussian distributions. The parameters lambda (precisions of the distributions of the weights) and alpha (precision of the distribution of the noise) are estimated as well. The estimation is done by an iterative procedure (Evidence Maximization).

Read more in the User Guide.

Parameters

n_iter: Maximum number of iterations. Default is 300.
tol: Stop the algorithm if w has converged. Default is 1.e-3.
alpha_1: Hyper-parameter: shape parameter for the Gamma distribution prior over the alpha parameter. Default is 1.e-6.
alpha_2: Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the alpha parameter. Default is 1.e-6.
lambda_1: Hyper-parameter: shape parameter for the Gamma distribution prior over the lambda parameter. Default is 1.e-6.
lambda_2: Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the lambda parameter. Default is 1.e-6.
compute_score: If True, compute the objective function at each step of the model. Default is False.
threshold_lambda: Threshold for removing (pruning) weights with high precision from the computation. Default is 1.e+4.
fit_intercept: Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. data is expected to be already centered). Default is True.
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
verbose: Verbose mode when fitting the model.

Attributes

coef_: Coefficients of the regression model (mean of distribution)
alpha_: estimated precision of the noise.
lambda_: estimated precisions of the weights.
sigma_: estimated variance-covariance matrix of the weights
scores_: if computed, value of the objective function (to be maximized)

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.ARDRegression()
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
ARDRegression(alpha_1=1e-06, alpha_2=1e-06, compute_score=False,
        copy_X=True, fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06,
        n_iter=300, normalize=False, threshold_lambda=10000.0, tol=0.001,
        verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.])

Notes

See examples/linear_model/plot_ard.py for an example.

POSSIBLE NODE NAMES:
	ARDRegressionSklearn ARDRegressionSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.AdaBoostClassifierSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.AdaBoostClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

An AdaBoost classifier.

This node has been automatically generated by wrapping the sklearn.ensemble.weight_boosting.AdaBoostClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

An AdaBoost [1] classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.

This class implements the algorithm known as AdaBoost-SAMME [2].

Read more in the User Guide.

Parameters

base_estimator: The base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes.
n_estimators: The maximum number of estimators at which boosting is terminated. In case of perfect fit, the learning procedure is stopped early.
learning_rate: Learning rate shrinks the contribution of each classifier by learning_rate. There is a trade-off between learning_rate and n_estimators.
algorithm: If ‘SAMME.R’ then use the SAMME.R real boosting algorithm. base_estimator must support calculation of class probabilities. If ‘SAMME’ then use the SAMME discrete boosting algorithm. The SAMME.R algorithm typically converges faster than SAMME, achieving a lower test error with fewer boosting iterations.
random_state: If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
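
To make the reweighting idea concrete, here is a minimal sketch of a single discrete SAMME boosting round. It is a simplified illustration and not the code of the wrapped sklearn class; the function name `samme_round` and its return convention are assumptions made for this example.

```python
import math


def samme_round(sample_weights, misclassified, n_classes, learning_rate=1.0):
    """Return (estimator_weight, new_sample_weights) for one SAMME round.

    Returns (None, unchanged weights) on a perfect fit, where boosting
    would stop early.
    """
    total = sum(sample_weights)
    error = sum(w for w, wrong in zip(sample_weights, misclassified) if wrong) / total
    if error <= 0.0:
        return None, list(sample_weights)
    if error >= 1.0 - 1.0 / n_classes:
        # Simplification for this sketch: reject learners no better than chance.
        raise ValueError("base estimator is not better than random guessing")
    estimator_weight = learning_rate * (
        math.log((1.0 - error) / error) + math.log(n_classes - 1.0)
    )
    # Only misclassified samples are up-weighted, then everything is renormalized.
    updated = [
        w * math.exp(estimator_weight) if wrong else w
        for w, wrong in zip(sample_weights, misclassified)
    ]
    norm = sum(updated)
    return estimator_weight, [w / norm for w in updated]
```

A smaller learning_rate shrinks each estimator weight, so more estimators are usually needed to reach the same fit, which is the trade-off mentioned above.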

Attributes

estimators_: The collection of fitted sub-estimators.
classes_: The class labels.
n_classes_: The number of classes.
estimator_weights_: Weights for each estimator in the boosted ensemble.
estimator_errors_: Classification error for each estimator in the boosted ensemble.
feature_importances_: The feature importances if supported by the base_estimator.

See also

AdaBoostRegressor, GradientBoostingClassifier, DecisionTreeClassifier

References

[1]	Y. Freund, R. Schapire, “A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting”, 1995.

[2]	J. Zhu, H. Zou, S. Rosset, T. Hastie, “Multi-class AdaBoost”, 2009.

POSSIBLE NODE NAMES:
	AdaBoostClassifierSklearnNode AdaBoostClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.AdaBoostRegressorSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.AdaBoostRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

An AdaBoost regressor.

This node has been automatically generated by wrapping the sklearn.ensemble.weight_boosting.AdaBoostRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

An AdaBoost [1] regressor is a meta-estimator that begins by fitting a regressor on the original dataset and then fits additional copies of the regressor on the same dataset but where the weights of instances are adjusted according to the error of the current prediction. As such, subsequent regressors focus more on difficult cases.

This class implements the algorithm known as AdaBoost.R2 [2].

Read more in the User Guide.

Parameters

base_estimator: The base estimator from which the boosted ensemble is built. Support for sample weighting is required.
n_estimators: The maximum number of estimators at which boosting is terminated. In case of perfect fit, the learning procedure is stopped early.
learning_rate: Learning rate shrinks the contribution of each regressor by learning_rate. There is a trade-off between learning_rate and n_estimators.
loss: The loss function to use when updating the weights after each boosting iteration.
random_state: If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
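
The weight update described above can be sketched as one AdaBoost.R2 round with a linear or square loss. This is a simplified reading of the algorithm in [2], not the wrapped implementation; `r2_round` and the two loss names handled here are assumptions of this example.

```python
import math


def r2_round(sample_weights, y_true, y_pred, learning_rate=1.0, loss="linear"):
    """Return (estimator_weight, new_sample_weights) for one AdaBoost.R2 round."""
    total = sum(sample_weights)
    weights = [w / total for w in sample_weights]
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    max_error = max(errors)
    if max_error == 0.0:
        return None, weights  # perfect fit: boosting would stop early
    # Per-sample loss scaled to [0, 1].
    losses = [e / max_error for e in errors]
    if loss == "square":
        losses = [value ** 2 for value in losses]
    average_loss = sum(w * value for w, value in zip(weights, losses))
    if average_loss <= 0.0:
        return None, weights
    if average_loss >= 0.5:
        # Simplification for this sketch: reject regressors that are too weak.
        raise ValueError("average loss too high to continue boosting")
    beta = average_loss / (1.0 - average_loss)
    estimator_weight = learning_rate * math.log(1.0 / beta)
    # Well-predicted samples (small loss) are down-weighted the most.
    updated = [
        w * beta ** ((1.0 - value) * learning_rate)
        for w, value in zip(weights, losses)
    ]
    norm = sum(updated)
    return estimator_weight, [w / norm for w in updated]
```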

Attributes

estimators_: The collection of fitted sub-estimators.
estimator_weights_: Weights for each estimator in the boosted ensemble.
estimator_errors_: Regression error for each estimator in the boosted ensemble.
feature_importances_: The feature importances if supported by the base_estimator.

See also

AdaBoostClassifier, GradientBoostingRegressor, DecisionTreeRegressor

References

[1]	Y. Freund, R. Schapire, “A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting”, 1995.

[2]	H. Drucker, “Improving Regressors using Boosting Techniques”, 1997.

POSSIBLE NODE NAMES:
	AdaBoostRegressorSklearnNode AdaBoostRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.AdditiveChi2SamplerTransformerSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.AdditiveChi2SamplerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

Approximate feature map for additive chi2 kernel.

This node has been automatically generated by wrapping the sklearn.kernel_approximation.AdditiveChi2Sampler class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Samples the Fourier transform of the kernel characteristic at regular intervals.

Since the kernel that is to be approximated is additive, the components of the input vectors can be treated separately. Each entry in the original space is transformed into 2*sample_steps-1 features, where sample_steps is a parameter of the method. Typical values of sample_steps include 1, 2 and 3.

Optimal choices for the sampling interval for certain data ranges can be computed (see the reference). The default values should be reasonable.

Read more in the User Guide.

Parameters

sample_steps: Gives the number of (complex) sampling points.
sample_interval: Sampling interval. Must be specified when sample_steps not in {1,2,3}.

Notes

This estimator approximates a slightly different version of the additive chi squared kernel than metric.additive_chi2 computes.

See also

SkewedChi2Sampler: A Fourier approximation to a non-additive variant of the chi squared kernel.

sklearn.metrics.pairwise.chi2_kernel: The exact chi squared kernel.

sklearn.metrics.pairwise.additive_chi2_kernel: The exact additive chi squared kernel.

References

See “Efficient additive kernels via explicit feature maps” A. Vedaldi and A. Zisserman, Pattern Analysis and Machine Intelligence, 2011

POSSIBLE NODE NAMES:
	AdditiveChi2SamplerTransformerSklearnNode AdditiveChi2SamplerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.BaggingClassifierSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.BaggingClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

A Bagging classifier.

This node has been automatically generated by wrapping the sklearn.ensemble.bagging.BaggingClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A Bagging classifier is an ensemble meta-estimator that fits base classifiers each on random subsets of the original dataset and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

This algorithm encompasses several works from the literature. When random subsets of the dataset are drawn as random subsets of the samples, then this algorithm is known as Pasting [1]. If samples are drawn with replacement, then the method is known as Bagging [2]. When random subsets of the dataset are drawn as random subsets of the features, then the method is known as Random Subspaces [3]. Finally, when base estimators are built on subsets of both samples and features, then the method is known as Random Patches [4].

Read more in the User Guide.

Parameters

base_estimator: object or None, optional (default=None). The base estimator to fit on random subsets of the dataset. If None, then the base estimator is a decision tree.
n_estimators: int, optional (default=10). The number of base estimators in the ensemble.
max_samples: int or float, optional (default=1.0). The number of samples to draw from X to train each base estimator. If int, then draw max_samples samples. If float, then draw max_samples * X.shape[0] samples.
max_features: int or float, optional (default=1.0). The number of features to draw from X to train each base estimator. If int, then draw max_features features. If float, then draw max_features * X.shape[1] features.
bootstrap: boolean, optional (default=True). Whether samples are drawn with replacement.
bootstrap_features: boolean, optional (default=False). Whether features are drawn with replacement.
oob_score: bool. Whether to use out-of-bag samples to estimate the generalization error.
warm_start: bool, optional (default=False). When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new ensemble. New in version 0.17: warm_start constructor parameter.
n_jobs: int, optional (default=1). The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.
random_state: int, RandomState instance or None, optional (default=None). If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
verbose: int, optional (default=0). Controls the verbosity of the building process.
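
The int-or-float convention of max_samples and max_features, together with the bootstrap flags, can be sketched as follows. The helpers `resolve_count` and `draw_indices` are hypothetical names used only for this illustration, and truncating the float case with `int()` is an assumption rather than a documented guarantee.

```python
import random


def resolve_count(value, total):
    """Turn an int (absolute count) or float (fraction of total) into a count."""
    if isinstance(value, float):
        return int(value * total)  # assumption: fractions are truncated
    return value


def draw_indices(total, count, with_replacement, rng):
    """Draw `count` indices from range(total), with or without replacement."""
    if with_replacement:
        return [rng.randrange(total) for _ in range(count)]
    return rng.sample(range(total), count)


# Example: 10 samples, half of them drawn with replacement (Bagging),
# and 2 of 4 features drawn without replacement (Random Subspaces).
rng = random.Random(0)
sample_indices = draw_indices(10, resolve_count(0.5, 10), True, rng)
feature_indices = draw_indices(4, resolve_count(2, 4), False, rng)
```

Drawing samples without replacement corresponds to Pasting, and combining sample and feature subsets corresponds to Random Patches in the terminology above.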

Attributes

base_estimator_: The base estimator from which the ensemble is grown.
estimators_: The collection of fitted base estimators.
estimators_samples_: The subset of drawn samples (i.e., the in-bag samples) for each base estimator.
estimators_features_: The subset of drawn features for each base estimator.
classes_: The class labels.
n_classes_: The number of classes.
oob_score_: Score of the training dataset obtained using an out-of-bag estimate.
oob_decision_function_: Decision function computed with out-of-bag estimate on the training set. If n_estimators is small it might be possible that a data point was never left out during the bootstrap. In this case, oob_decision_function_ might contain NaN.

References

[1]	L. Breiman, “Pasting small votes for classification in large databases and on-line”, Machine Learning, 36(1), 85-103, 1999.

[2]	L. Breiman, “Bagging predictors”, Machine Learning, 24(2), 123-140, 1996.

[3]	T. Ho, “The random subspace method for constructing decision forests”, Pattern Analysis and Machine Intelligence, 20(8), 832-844, 1998.

[4]	G. Louppe and P. Geurts, “Ensembles on Random Patches”, Machine Learning and Knowledge Discovery in Databases, 346-361, 2012.

POSSIBLE NODE NAMES:
	BaggingClassifierSklearn BaggingClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.BaggingRegressorSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.BaggingRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

A Bagging regressor.

This node has been automatically generated by wrapping the sklearn.ensemble.bagging.BaggingRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A Bagging regressor is an ensemble meta-estimator that fits base regressors each on random subsets of the original dataset and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

This algorithm encompasses several works from the literature. When random subsets of the dataset are drawn as random subsets of the samples, then this algorithm is known as Pasting [1]. If samples are drawn with replacement, then the method is known as Bagging [2]. When random subsets of the dataset are drawn as random subsets of the features, then the method is known as Random Subspaces [3]. Finally, when base estimators are built on subsets of both samples and features, then the method is known as Random Patches [4].

Read more in the User Guide.

Parameters

base_estimator: object or None, optional (default=None). The base estimator to fit on random subsets of the dataset. If None, then the base estimator is a decision tree.
n_estimators: int, optional (default=10). The number of base estimators in the ensemble.
max_samples: int or float, optional (default=1.0). The number of samples to draw from X to train each base estimator. If int, then draw max_samples samples. If float, then draw max_samples * X.shape[0] samples.
max_features: int or float, optional (default=1.0). The number of features to draw from X to train each base estimator. If int, then draw max_features features. If float, then draw max_features * X.shape[1] features.
bootstrap: boolean, optional (default=True). Whether samples are drawn with replacement.
bootstrap_features: boolean, optional (default=False). Whether features are drawn with replacement.
oob_score: bool. Whether to use out-of-bag samples to estimate the generalization error.
warm_start: bool, optional (default=False). When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new ensemble.
n_jobs: int, optional (default=1). The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.
random_state: int, RandomState instance or None, optional (default=None). If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
verbose: int, optional (default=0). Controls the verbosity of the building process.

Attributes

estimators_: The collection of fitted sub-estimators.
estimators_samples_: The subset of drawn samples (i.e., the in-bag samples) for each base estimator.
estimators_features_: The subset of drawn features for each base estimator.
oob_score_: Score of the training dataset obtained using an out-of-bag estimate.
oob_prediction_: Prediction computed with out-of-bag estimate on the training set. If n_estimators is small it might be possible that a data point was never left out during the bootstrap. In this case, oob_prediction_ might contain NaN.
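
The NaN case mentioned for oob_prediction_ can be illustrated with a small sketch that averages, for every training sample, only the predictions of estimators that did not see it. The function `oob_predictions` and its list-based inputs are assumptions made for this example and do not mirror the wrapped class's internals.

```python
import math


def oob_predictions(in_bag_indices, estimator_predictions, n_samples):
    """Average out-of-bag predictions per sample; NaN if never out of bag."""
    sums = [0.0] * n_samples
    counts = [0] * n_samples
    for in_bag, predictions in zip(in_bag_indices, estimator_predictions):
        seen = set(in_bag)
        for index in range(n_samples):
            if index not in seen:
                sums[index] += predictions[index]
                counts[index] += 1
    return [
        sums[i] / counts[i] if counts[i] else math.nan
        for i in range(n_samples)
    ]


# Two estimators on three samples: sample 0 is in both bags, so it has no
# out-of-bag prediction and ends up as NaN.
result = oob_predictions([[0, 1], [0, 2]], [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]], 3)
```

With more estimators, the chance that a sample is in every bag shrinks, so NaN entries become rare.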

References

[1]	L. Breiman, “Pasting small votes for classification in large databases and on-line”, Machine Learning, 36(1), 85-103, 1999.

[2]	L. Breiman, “Bagging predictors”, Machine Learning, 24(2), 123-140, 1996.

[3]	T. Ho, “The random subspace method for constructing decision forests”, Pattern Analysis and Machine Intelligence, 20(8), 832-844, 1998.

[4]	G. Louppe and P. Geurts, “Ensembles on Random Patches”, Machine Learning and Knowledge Discovery in Databases, 346-361, 2012.

POSSIBLE NODE NAMES:
	BaggingRegressorSklearnNode BaggingRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.BayesianRidgeRegressorSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.BayesianRidgeRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

Bayesian ridge regression

This node has been automatically generated by wrapping the sklearn.linear_model.bayes.BayesianRidge class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Fit a Bayesian ridge model and optimize the regularization parameters lambda (precision of the weights) and alpha (precision of the noise).

Read more in the User Guide.

Parameters

n_iter: Maximum number of iterations. Default is 300.
tol: Stop the algorithm if w has converged. Default is 1.e-3.
alpha_1: Hyper-parameter: shape parameter for the Gamma distribution prior over the alpha parameter. Default is 1.e-6.
alpha_2: Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the alpha parameter. Default is 1.e-6.
lambda_1: Hyper-parameter: shape parameter for the Gamma distribution prior over the lambda parameter. Default is 1.e-6.
lambda_2: Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the lambda parameter. Default is 1.e-6.
compute_score: If True, compute the objective function at each step of the model. Default is False.
fit_intercept: whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered). Default is True.
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
verbose: Verbose mode when fitting the model.

Attributes

coef_: Coefficients of the regression model (mean of distribution)
alpha_: estimated precision of the noise.
lambda_: estimated precisions of the weights.
scores_: if computed, value of the objective function (to be maximized)

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.BayesianRidge()
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False,
        copy_X=True, fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06,
        n_iter=300, normalize=False, tol=0.001, verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.])

Notes

See examples/linear_model/plot_bayesian_ridge.py for an example.

POSSIBLE NODE NAMES:
	BayesianRidgeRegressorSklearn BayesianRidgeRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.BernoulliNBClassifierSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.BernoulliNBClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

Naive Bayes classifier for multivariate Bernoulli models.

This node has been automatically generated by wrapping the sklearn.naive_bayes.BernoulliNB class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Like MultinomialNB, this classifier is suitable for discrete data. The difference is that while MultinomialNB works with occurrence counts, BernoulliNB is designed for binary/boolean features.

Read more in the User Guide.

Parameters

alpha: Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).
binarize: Threshold for binarizing (mapping to booleans) of sample features. If None, input is presumed to already consist of binary vectors.
fit_prior: Whether to learn class prior probabilities or not. If false, a uniform prior will be used.
class_prior: Prior probabilities of the classes. If specified the priors are not adjusted according to the data.
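
The effect of alpha can be seen in a short sketch of the smoothed Bernoulli estimate of P(x_i = 1 | y). The helper name `smoothed_feature_prob` is hypothetical; the factor 2 in the denominator reflects the two possible values of a binary feature.

```python
def smoothed_feature_prob(feature_count, class_count, alpha=1.0):
    """Estimate P(x_i = 1 | y) with additive (Laplace/Lidstone) smoothing.

    feature_count: samples of the class in which the feature is 1.
    class_count:   samples of the class overall.
    """
    return (feature_count + alpha) / (class_count + 2.0 * alpha)


# A feature never seen in a class of 3 samples still gets nonzero probability.
never_seen = smoothed_feature_prob(0, 3)
```

With alpha set to 0 the estimate reduces to the raw relative frequency, so features that never occur in a class receive probability zero.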

Attributes

class_log_prior_: Log probability of each class (smoothed).
feature_log_prob_: Empirical log probability of features given a class, P(x_i|y).
class_count_: Number of samples encountered for each class during fitting. This value is weighted by the sample weight when provided.
feature_count_: Number of samples encountered for each (class, feature) during fitting. This value is weighted by the sample weight when provided.

Examples

>>> import numpy as np
>>> X = np.random.randint(2, size=(6, 100))
>>> Y = np.array([1, 2, 3, 4, 4, 5])
>>> from sklearn.naive_bayes import BernoulliNB
>>> clf = BernoulliNB()
>>> clf.fit(X, Y)
BernoulliNB(alpha=1.0, binarize=0.0, class_prior=None, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]

References

C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to Information Retrieval. Cambridge University Press, pp. 234-265. http://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html

A. McCallum and K. Nigam (1998). A comparison of event models for naive Bayes text classification. Proc. AAAI/ICML-98 Workshop on Learning for Text Categorization, pp. 41-48.

V. Metsis, I. Androutsopoulos and G. Paliouras (2006). Spam filtering with naive Bayes – Which naive Bayes? 3rd Conf. on Email and Anti-Spam (CEAS).

POSSIBLE NODE NAMES:
	BernoulliNBClassifierSklearnNode BernoulliNBClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.BernoulliRBMTransformerSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.BernoulliRBMTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

Bernoulli Restricted Boltzmann Machine (RBM).

This node has been automatically generated by wrapping the sklearn.neural_network.rbm.BernoulliRBM class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A Restricted Boltzmann Machine with binary visible units and binary hidden units. Parameters are estimated using Stochastic Maximum Likelihood (SML), also known as Persistent Contrastive Divergence (PCD) [2].

The time complexity of this implementation is O(d ** 2) assuming d ~ n_features ~ n_components.

Read more in the User Guide.

Parameters

n_components: Number of binary hidden units.
learning_rate: The learning rate for weight updates. It is highly recommended to tune this hyper-parameter. Reasonable values are in the 10**[0., -3.] range.
batch_size: Number of examples per minibatch.
n_iter: Number of iterations/sweeps over the training dataset to perform during training.
verbose: The verbosity level. The default, zero, means silent mode.
random_state: A random number generator instance to define the state of the random permutations generator. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

Attributes

intercept_hidden_: Biases of the hidden units.
intercept_visible_: Biases of the visible units.
components_: Weight matrix, where n_features is the number of visible units and n_components is the number of hidden units.

Examples

>>> import numpy as np
>>> from sklearn.neural_network import BernoulliRBM
>>> X = np.array([[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
>>> model = BernoulliRBM(n_components=2)
>>> model.fit(X)
BernoulliRBM(batch_size=10, learning_rate=0.1, n_components=2, n_iter=10,
       random_state=None, verbose=0)

References

[1] Hinton, G. E., Osindero, S. and Teh, Y. A fast learning algorithm for deep belief nets. Neural Computation 18, pp 1527-1554. http://www.cs.toronto.edu/~hinton/absps/fastnc.pdf
[2] Tieleman, T. Training Restricted Boltzmann Machines using Approximations to the Likelihood Gradient. International Conference on Machine Learning (ICML) 2008

POSSIBLE NODE NAMES:
	BernoulliRBMTransformerSklearnNode BernoulliRBMTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.BinarizerTransformerSklearnNode`

class pySPACE.missions.nodes.scikit_nodes.BinarizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)

Bases: pySPACE.missions.nodes.base_node.BaseNode

Binarize data (set feature values to 0 or 1) according to a threshold

This node has been automatically generated by wrapping the sklearn.preprocessing.data.Binarizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Values greater than the threshold map to 1, while values less than or equal to the threshold map to 0. With the default threshold of 0, only positive values map to 1.

Binarization is a common operation on text count data, where the analyst may decide to consider only the presence or absence of a feature rather than its number of occurrences.

It can also be used as a pre-processing step for estimators that consider boolean random variables (e.g. modelled using the Bernoulli distribution in a Bayesian setting).

Read more in the User Guide.

Parameters

threshold: Feature values below or equal to this are replaced by 0, above it by 1. Threshold may not be less than 0 for operations on sparse matrices.
copy: set to False to perform inplace binarization and avoid a copy (if the input is already a numpy array or a scipy.sparse CSR matrix).
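
The threshold rule can be written out in a few lines of plain Python. This is only a sketch of the mapping described above for dense data; `binarize` here is a local helper defined for this illustration and not a call into the wrapped class.

```python
def binarize(values, threshold=0.0):
    """Map values strictly greater than threshold to 1.0, all others to 0.0."""
    return [1.0 if value > threshold else 0.0 for value in values]


# With the default threshold of 0, only positive values map to 1.
default_result = binarize([-1.0, 0.0, 0.5, 2.0])

# A value equal to the threshold maps to 0.
shifted_result = binarize([1.0, 2.0, 3.0], threshold=2.0)
```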

Notes

If the input is a sparse matrix, only the non-zero values are subject to update by the Binarizer class.

This estimator is stateless (besides constructor parameters); the fit method does nothing but is useful when used in a pipeline.

POSSIBLE NODE NAMES:
	BinarizerTransformerSklearn BinarizerTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.BirchTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.BirchTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Implements the Birch clustering algorithm.

This node has been automatically generated by wrapping the sklearn.cluster.birch.Birch class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Every new sample is inserted into the root of the Clustering Feature Tree. It is then merged with the subcluster whose centroid is closest to the new sample. This is done recursively until it ends up at the subcluster of the leaf of the tree that has the closest centroid.

Read more in the User Guide.

Parameters

threshold: The radius of the subcluster obtained by merging a new sample and the closest subcluster should be less than the threshold. Otherwise a new subcluster is started.
branching_factor: Maximum number of CF subclusters in each node. If a new sample enters such that the number of subclusters exceeds the branching_factor, then the node has to be split. The corresponding parent also has to be split, and if the number of subclusters in the parent is greater than the branching factor, then it has to be split recursively.
n_clusters: Number of clusters after the final clustering step, which treats the subclusters from the leaves as new samples. By default, this final clustering step is not performed and the subclusters are returned as they are. If a model is provided, the model is fit treating the subclusters as new samples and the initial data is mapped to the label of the closest subcluster. If an int is provided, the model fit is AgglomerativeClustering with n_clusters set to the int.
compute_labels: Whether or not to compute labels for each fit.
copy: Whether or not to make a copy of the given data. If set to False, the initial data will be overwritten.
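As a rough illustration of the threshold test, the sketch below keeps the usual clustering-feature summary of a subcluster (sample count, per-dimension linear sum, sum of squared norms) and computes the radius the subcluster would have after absorbing one more point. The function names are hypothetical and the radius formula is the standard CF definition, assumed here rather than taken from the wrapped class.

```python
import math


def merged_radius(n, linear_sum, squared_sum, point):
    """Radius of the subcluster that would result from absorbing `point`.

    n           -- number of samples already in the subcluster
    linear_sum  -- per-dimension sum of those samples
    squared_sum -- sum of the squared norms of those samples
    """
    new_n = n + 1
    new_ls = [s + p for s, p in zip(linear_sum, point)]
    new_ss = squared_sum + sum(p * p for p in point)
    centroid_sq_norm = sum((s / new_n) ** 2 for s in new_ls)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(new_ss / new_n - centroid_sq_norm, 0.0))


def can_absorb(n, linear_sum, squared_sum, point, threshold):
    """True if the merged subcluster stays within the threshold radius."""
    return merged_radius(n, linear_sum, squared_sum, point) < threshold


# Subcluster holding the points [0, 1] and [0.3, 1].
n, ls, ss = 2, [0.3, 2.0], 2.09

print(can_absorb(n, ls, ss, [-0.3, 1.0], threshold=0.5))  # nearby point
print(can_absorb(n, ls, ss, [0.0, -1.0], threshold=0.5))  # distant point
```

With threshold=0.5 the nearby point is absorbed, while the distant one would start a new subcluster.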

Attributes

root_: Root of the CFTree.
dummy_leaf_: Start pointer to all the leaves.
subcluster_centers_: Centroids of all subclusters read directly from the leaves.
subcluster_labels_: Labels assigned to the centroids of the subclusters after they are clustered globally.
labels_: Array of labels assigned to the input data. If partial_fit is used instead of fit, they are assigned to the last batch of data.

Examples

>>> from sklearn.cluster import Birch
>>> X = [[0, 1], [0.3, 1], [-0.3, 1], [0, -1], [0.3, -1], [-0.3, -1]]
>>> brc = Birch(branching_factor=50, n_clusters=None, threshold=0.5,
... compute_labels=True)
>>> brc.fit(X)
Birch(branching_factor=50, compute_labels=True, copy=True, n_clusters=None,
   threshold=0.5)
>>> brc.predict(X)
array([0, 0, 0, 1, 1, 1])

References

Tian Zhang, Raghu Ramakrishnan, Miron Livny. BIRCH: An efficient data clustering method for large databases. http://www.cs.sfu.ca/CourseCentral/459/han/papers/zhang96.pdf
Roberto Perdisci JBirch - Java implementation of BIRCH clustering algorithm https://code.google.com/p/jbirch/

POSSIBLE NODE NAMES:
	BirchTransformerSklearnNode BirchTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.CalibratedClassifierCVSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.CalibratedClassifierCVSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Probability calibration with isotonic regression or sigmoid.

This node has been automatically generated by wrapping the sklearn.calibration.CalibratedClassifierCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

With this class, the base_estimator is fit on the train set of the cross-validation generator and the test set is used for calibration. The probabilities for each of the folds are then averaged for prediction. If cv=“prefit” is passed to __init__, it is assumed that base_estimator has been fitted already and all data is used for calibration. Note that data for fitting the classifier and for calibrating it must be disjoint.

Read more in the User Guide.

Parameters

base_estimator

:instance BaseEstimator

The classifier whose output decision function needs to be calibrated to offer more accurate predict_proba outputs. If cv=“prefit”, the classifier must have been fit already on data.

method

:‘sigmoid’ or ‘isotonic’

The method to use for calibration. Can be ‘sigmoid’, which corresponds to Platt’s method, or ‘isotonic’, which is a non-parametric approach. It is not advised to use isotonic calibration with too few calibration samples (<<1000) since it tends to overfit. Use sigmoids (Platt’s calibration) in this case.

cv

:integer, cross-validation generator, iterable or “prefit”, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, if y is binary or multiclass, StratifiedKFold is used. If y is neither binary nor multiclass, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

If “prefit” is passed, it is assumed that base_estimator has been fitted already and all data is used for calibration.
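To make the two calibration methods more concrete, here is a minimal plain-Python sketch. It assumes the usual textbook forms: a sigmoid 1 / (1 + exp(a * f + b)) for Platt scaling, where a and b would normally be fitted on the calibration data, and the pool-adjacent-violators procedure for isotonic regression. The function names are illustrative and nothing here calls the wrapped class.

```python
import math


def sigmoid_calibrate(score, a, b):
    """Platt-style mapping of a decision value to a probability.

    a and b are assumed to have been fitted beforehand.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def pool_adjacent_violators(values):
    """Least-squares non-decreasing fit to `values` (isotonic regression)."""
    blocks = []  # each block is [sum of values, number of values]
    for value in values:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# 0/1 labels of calibration samples, ordered by increasing classifier score.
labels_by_score = [0, 0, 1, 0, 1, 1]
print(pool_adjacent_violators(labels_by_score))  # [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
print(sigmoid_calibrate(0.0, a=-1.0, b=0.0))     # 0.5
```

The isotonic fit is a step function with as many free levels as the data supports, which is one way to see why it can overfit when few calibration samples are available.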

Attributes

classes_: The class labels.
calibrated_classifiers_: list (len() equal to cv or 1 if cv == “prefit”): The list of calibrated classifiers, one for each crossvalidation fold, which has been fitted on all but the validation fold and calibrated on the validation fold.

References

[1]	Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers, B. Zadrozny & C. Elkan, ICML 2001

[2]	Transforming Classifier Scores into Accurate Multiclass Probability Estimates, B. Zadrozny & C. Elkan, (KDD 2002)

[3]	Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods, J. Platt, (1999)

[4]	Predicting Good Probabilities with Supervised Learning, A. Niculescu-Mizil & R. Caruana, ICML 2005

POSSIBLE NODE NAMES:
	CalibratedClassifierCVSklearnNode CalibratedClassifierCVSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.CountVectorizerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.CountVectorizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Convert a collection of text documents to a matrix of token counts

This node has been automatically generated by wrapping the sklearn.feature_extraction.text.CountVectorizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This implementation produces a sparse representation of the counts using scipy.sparse.coo_matrix.

If you do not provide an a-priori dictionary and you do not use an analyzer that does some kind of feature selection then the number of features will be equal to the vocabulary size found by analyzing the data.

Read more in the User Guide.

Parameters

input

:string {‘filename’, ‘file’, ‘content’}

If ‘filename’, the sequence passed as an argument to fit is expected to be a list of filenames that need reading to fetch the raw content to analyze.

If ‘file’, the sequence items must have a ‘read’ method (file-like object) that is called to fetch the bytes in memory.

Otherwise the input is expected to be a sequence of items, of type string or bytes, that are analyzed directly.

encoding

:string, ‘utf-8’ by default.

If bytes or files are given to analyze, this encoding is used to decode.

decode_error

:{‘strict’, ‘ignore’, ‘replace’}

Instruction on what to do if a byte sequence is given to analyze that contains characters not of the given encoding. By default, it is ‘strict’, meaning that a UnicodeDecodeError will be raised. Other values are ‘ignore’ and ‘replace’.

strip_accents

:{‘ascii’, ‘unicode’, None}

Remove accents during the preprocessing step. ‘ascii’ is a fast method that only works on characters that have a direct ASCII mapping. ‘unicode’ is a slightly slower method that works on any characters. None (default) does nothing.

analyzer

:string, {‘word’, ‘char’, ‘char_wb’} or callable

Whether the feature should be made of word or character n-grams. Option ‘char_wb’ creates character n-grams only from text inside word boundaries.

If a callable is passed it is used to extract the sequence of features out of the raw, unprocessed input. Only applies if analyzer == 'word'.

preprocessor

:callable or None (default)

Override the preprocessing (string transformation) stage while preserving the tokenizing and n-grams generation steps.

tokenizer

:callable or None (default)

Override the string tokenization step while preserving the preprocessing and n-grams generation steps. Only applies if analyzer == 'word'.

ngram_range

:tuple (min_n, max_n)

The lower and upper boundary of the range of n-values for different n-grams to be extracted. All values of n such that min_n <= n <= max_n will be used.

stop_words

:string {‘english’}, list, or None (default)

If ‘english’, a built-in stop word list for English is used.

If a list, that list is assumed to contain stop words, all of which will be removed from the resulting tokens. Only applies if analyzer == 'word'.

If None, no stop words will be used. max_df can be set to a value in the range [0.7, 1.0) to automatically detect and filter stop words based on intra corpus document frequency of terms.

lowercase

:boolean, True by default

Convert all characters to lowercase before tokenizing.

token_pattern

:string

Regular expression denoting what constitutes a “token”, only used if analyzer == 'word'. The default regexp selects tokens of 2 or more alphanumeric characters (punctuation is completely ignored and always treated as a token separator).

max_df

:float in range [0.0, 1.0] or int, default=1.0

When building the vocabulary, ignore terms that have a document frequency strictly higher than the given threshold (corpus-specific stop words). If float, the parameter represents a proportion of documents; if integer, absolute counts. This parameter is ignored if vocabulary is not None.

min_df

:float in range [0.0, 1.0] or int, default=1

When building the vocabulary, ignore terms that have a document frequency strictly lower than the given threshold. This value is also called cut-off in the literature. If float, the parameter represents a proportion of documents; if integer, absolute counts. This parameter is ignored if vocabulary is not None.

max_features

:int or None, default=None

If not None, build a vocabulary that only considers the top max_features ordered by term frequency across the corpus.

This parameter is ignored if vocabulary is not None.

vocabulary

:Mapping or iterable, optional

Either a Mapping (e.g., a dict) where keys are terms and values are indices in the feature matrix, or an iterable over terms. If not given, a vocabulary is determined from the input documents. Indices in the mapping should not be repeated and should not have any gap between 0 and the largest index.

binary

:boolean, default=False

If True, all non zero counts are set to 1. This is useful for discrete probabilistic models that model binary events rather than integer counts.

dtype

:type, optional

Type of the matrix returned by fit_transform() or transform().
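The counting step controlled by the parameters above can be sketched with the standard library alone. The regular expression below is assumed to match the default token_pattern (tokens of two or more alphanumeric characters); the function name count_tokens is hypothetical, and the result is a dense list of lists rather than the sparse matrix the wrapped class produces.

```python
import re
from collections import Counter

# Assumed equivalent of the default token pattern: 2+ alphanumeric characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def count_tokens(documents):
    """Return (vocabulary, rows) with one row of token counts per document."""
    # lowercase=True behaviour: lowercase before tokenizing.
    tokenized = [TOKEN_PATTERN.findall(doc.lower()) for doc in documents]
    terms = sorted({token for tokens in tokenized for token in tokens})
    vocabulary = {term: index for index, term in enumerate(terms)}
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts.get(term, 0) for term in terms])
    return vocabulary, rows


docs = ["The cat sat.", "The cat saw a cat!"]
vocabulary, rows = count_tokens(docs)
print(vocabulary)  # {'cat': 0, 'sat': 1, 'saw': 2, 'the': 3}
print(rows)        # [[1, 1, 0, 1], [2, 0, 1, 1]]
```

The single-character token "a" is dropped by the pattern, and punctuation only acts as a separator.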

Attributes

vocabulary_

:dict

A mapping of terms to feature indices.

stop_words_

:set

Terms that were ignored because they either:

occurred in too many documents (max_df)

occurred in too few documents (min_df)

were cut off by feature selection (max_features).

This is only available if no vocabulary was given.

See also

HashingVectorizer, TfidfVectorizer

Notes

The stop_words_ attribute can get large and increase the model size when pickling. This attribute is provided only for introspection and can be safely removed using delattr or set to None before pickling.

POSSIBLE NODE NAMES:
	CountVectorizerTransformerSklearn CountVectorizerTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.DecisionTreeClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.DecisionTreeClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

A decision tree classifier.

This node has been automatically generated by wrapping the sklearn.tree.tree.DecisionTreeClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

criterion

:string, optional (default=“gini”)

The function to measure the quality of a split. Supported criteria are “gini” for the Gini impurity and “entropy” for the information gain.

splitter

:string, optional (default=“best”)

The strategy used to choose the split at each node. Supported strategies are “best” to choose the best split and “random” to choose the best random split.

max_features

:int, float, string or None, optional (default=None)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.

If float, then max_features is a percentage and int(max_features * n_features) features are considered at each split.

If “auto”, then max_features=sqrt(n_features).

If “sqrt”, then max_features=sqrt(n_features).

If “log2”, then max_features=log2(n_features).

If None, then max_features=n_features.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features.

max_depth

:int or None, optional (default=None)

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. Ignored if max_leaf_nodes is not None.

min_samples_split

:int, optional (default=2)

The minimum number of samples required to split an internal node.

min_samples_leaf

:int, optional (default=1)

The minimum number of samples required to be at a leaf node.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node.

max_leaf_nodes

:int or None, optional (default=None)

Grow a tree with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes. If not None then max_depth will be ignored.

class_weight

:dict, list of dicts, “balanced” or None, optional (default=None)

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. For multi-output problems, a list of dicts can be provided in the same order as the columns of y.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

For multi-output, the weights of each column of y will be multiplied.

Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

presort

:bool, optional (default=False)

Whether to presort the data to speed up the finding of best splits in fitting. For the default settings of a decision tree on large datasets, setting this to true may slow down the training process. When using either a smaller dataset or a restricted depth, this may speed up the training.
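For orientation, the two split criteria named under criterion can be written out as follows. This is a small sketch using the textbook definitions of Gini impurity and entropy over the class proportions of a node; the helper names are made up for the example.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of samples belonging to each class present in `labels`."""
    counts = Counter(labels)
    return [count / len(labels) for count in counts.values()]


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


mixed = ["a", "a", "b", "b"]
print(gini(mixed))     # 0.5
print(entropy(mixed))  # 1.0
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, so a good split is one that lowers the weighted impurity of the children.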

Attributes

classes_: The class labels (single output problem), or a list of arrays of class labels (multi-output problem).
feature_importances_: The feature importances. The higher, the more important the feature. The importance of a feature is computed as the (normalized) total reduction of the criterion brought by that feature. It is also known as the Gini importance [4].
max_features_: The inferred value of max_features.
n_classes_: The number of classes (for single output problems), or a list containing the number of classes for each output (for multi-output problems).
n_features_: The number of features when fit is performed.
n_outputs_: The number of outputs when fit is performed.
tree_: The underlying Tree object.
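The “balanced” mode of the class_weight parameter described under Parameters uses the formula n_samples / (n_classes * np.bincount(y)). A dependency-free sketch of that formula, with a hypothetical helper name and a Counter standing in for np.bincount, might look like this:

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight per class: n_samples / (n_classes * count of that class)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Three samples of class 0 and one of class 1: the rare class weighs more.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights[1])  # 2.0
print(weights[0])
```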

See also

DecisionTreeRegressor

References

[1]	http://en.wikipedia.org/wiki/Decision_tree_learning

[2]	L. Breiman, J. Friedman, R. Olshen, and C. Stone, “Classification and Regression Trees”, Wadsworth, Belmont, CA, 1984.

[3]	T. Hastie, R. Tibshirani and J. Friedman. “Elements of Statistical Learning”, Springer, 2009.

[4]	L. Breiman, and A. Cutler, “Random Forests”, http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.cross_validation import cross_val_score
>>> from sklearn.tree import DecisionTreeClassifier
>>> clf = DecisionTreeClassifier(random_state=0)
>>> iris = load_iris()
>>> cross_val_score(clf, iris.data, iris.target, cv=10)
array([ 1.     ,  0.93...,  0.86...,  0.93...,  0.93...,
        0.93...,  0.93...,  1.     ,  0.93...,  1.      ])

POSSIBLE NODE NAMES:
	DecisionTreeClassifierSklearnNode DecisionTreeClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.DecisionTreeRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.DecisionTreeRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

A decision tree regressor.

This node has been automatically generated by wrapping the sklearn.tree.tree.DecisionTreeRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

criterion

:string, optional (default=“mse”)

The function to measure the quality of a split. The only supported criterion is “mse” for the mean squared error, which is equal to variance reduction as feature selection criterion.

splitter

:string, optional (default=“best”)

The strategy used to choose the split at each node. Supported strategies are “best” to choose the best split and “random” to choose the best random split.

max_features

:int, float, string or None, optional (default=None)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.

If float, then max_features is a percentage and int(max_features * n_features) features are considered at each split.

If “auto”, then max_features=n_features.

If “sqrt”, then max_features=sqrt(n_features).

If “log2”, then max_features=log2(n_features).

If None, then max_features=n_features.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features.

max_depth

:int or None, optional (default=None)

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. Ignored if max_leaf_nodes is not None.

min_samples_split

:int, optional (default=2)

The minimum number of samples required to split an internal node.

min_samples_leaf

:int, optional (default=1)

The minimum number of samples required to be at a leaf node.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node.

max_leaf_nodes

:int or None, optional (default=None)

Grow a tree with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes. If not None then max_depth will be ignored.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

presort

:bool, optional (default=False)

Whether to presort the data to speed up the finding of best splits in fitting. For the default settings of a decision tree on large datasets, setting this to true may slow down the training process. When using either a smaller dataset or a restricted depth, this may speed up the training.
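The “mse” criterion can be read as variance reduction: a split is scored by how much the weighted mean squared error of the two children falls below that of the parent node. The sketch below illustrates this reading with hypothetical helper names; it is not the wrapped implementation.

```python
def mse(values):
    """Mean squared error around the mean, i.e. the variance of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_reduction(parent, left, right):
    """Parent MSE minus the size-weighted MSE of the two children."""
    weighted_children = (len(left) * mse(left) + len(right) * mse(right)) / len(parent)
    return mse(parent) - weighted_children


targets = [1.0, 1.0, 5.0, 5.0]
print(variance_reduction(targets, [1.0, 1.0], [5.0, 5.0]))  # 4.0
print(variance_reduction(targets, [1.0], [1.0, 5.0, 5.0]))  # smaller gain
```

The first split separates the two target levels perfectly and removes all of the parent's variance, so it would be preferred over the second.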

Attributes

feature_importances_: The feature importances. The higher, the more important the feature. The importance of a feature is computed as the (normalized) total reduction of the criterion brought by that feature. It is also known as the Gini importance [4].
max_features_: The inferred value of max_features.
n_features_: The number of features when fit is performed.
n_outputs_: The number of outputs when fit is performed.
tree_: The underlying Tree object.

See also

DecisionTreeClassifier

References

[1]	http://en.wikipedia.org/wiki/Decision_tree_learning

[2]	L. Breiman, J. Friedman, R. Olshen, and C. Stone, “Classification and Regression Trees”, Wadsworth, Belmont, CA, 1984.

[3]	T. Hastie, R. Tibshirani and J. Friedman. “Elements of Statistical Learning”, Springer, 2009.

[4]	L. Breiman, and A. Cutler, “Random Forests”, http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

Examples

>>> from sklearn.datasets import load_boston
>>> from sklearn.cross_validation import cross_val_score
>>> from sklearn.tree import DecisionTreeRegressor
>>> boston = load_boston()
>>> regressor = DecisionTreeRegressor(random_state=0)
>>> cross_val_score(regressor, boston.data, boston.target, cv=10)
array([ 0.61..., 0.57..., -0.34..., 0.41..., 0.75...,
        0.07..., 0.29..., 0.33..., -1.42..., -1.77...])

POSSIBLE NODE NAMES:
	DecisionTreeRegressorSklearn DecisionTreeRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.DictVectorizerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.DictVectorizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Transforms lists of feature-value mappings to vectors.

This node has been automatically generated by wrapping the sklearn.feature_extraction.dict_vectorizer.DictVectorizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This transformer turns lists of mappings (dict-like objects) of feature names to feature values into Numpy arrays or scipy.sparse matrices for use with scikit-learn estimators.

When feature values are strings, this transformer will do a binary one-hot (aka one-of-K) coding: one boolean-valued feature is constructed for each of the possible string values that the feature can take on. For instance, a feature “f” that can take on the values “ham” and “spam” will become two features in the output, one signifying “f=ham”, the other “f=spam”.

Features that do not occur in a sample (mapping) will have a zero value in the resulting array/matrix.
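The one-hot coding of string values and the zero filling of absent features can be sketched without any dependency. The function one_hot_dicts and its sample input are illustrative assumptions; the sketch returns plain lists instead of NumPy arrays or sparse matrices.

```python
def one_hot_dicts(samples, separator="="):
    """Turn a list of feature-value mappings into (feature_names, rows)."""
    def expand(sample):
        expanded = {}
        for name, value in sample.items():
            if isinstance(value, str):
                # String value: one boolean-valued feature per (name, value).
                expanded[name + separator + value] = 1.0
            else:
                expanded[name] = float(value)
        return expanded

    expanded_samples = [expand(sample) for sample in samples]
    feature_names = sorted({name for s in expanded_samples for name in s})
    # Features missing from a sample get the value 0.
    rows = [[s.get(name, 0.0) for name in feature_names]
            for s in expanded_samples]
    return feature_names, rows


names, rows = one_hot_dicts([{"f": "ham", "n": 2}, {"f": "spam"}])
print(names)  # ['f=ham', 'f=spam', 'n']
print(rows)   # [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
```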

Read more in the User Guide.

Parameters

dtype: The type of feature values. Passed to Numpy array/scipy.sparse matrix constructors as the dtype argument.
separator: string, optional: Separator string used when constructing new features for one-hot coding.
sparse: boolean, optional.: Whether transform should produce scipy.sparse matrices. True by default.
sort: boolean, optional.: Whether feature_names_ and vocabulary_ should be sorted when fitting. True by default.

Attributes

vocabulary_: A dictionary mapping feature names to feature indices.
feature_names_: A list of length n_features containing the feature names (e.g., “f=ham” and “f=spam”).

Examples

>>> from sklearn.feature_extraction import DictVectorizer
>>> v = DictVectorizer(sparse=False)
>>> D = [{'foo': 1, 'bar': 2}, {'foo': 3, 'baz': 1}]
>>> X = v.fit_transform(D)
>>> X
array([[ 2.,  0.,  1.],
       [ 0.,  1.,  3.]])
>>> v.inverse_transform(X) == [{'bar': 2.0, 'foo': 1.0}, {'baz': 1.0, 'foo': 3.0}]
True
>>> v.transform({'foo': 4, 'unseen_feature': 3})
array([[ 0.,  0.,  4.]])

See also

FeatureHasher: performs vectorization using only a hash function.
sklearn.preprocessing.OneHotEncoder: handles nominal/categorical features encoded as columns of integers.

POSSIBLE NODE NAMES:
	DictVectorizerTransformerSklearnNode DictVectorizerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.DictionaryLearningTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.DictionaryLearningTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Dictionary learning

This node has been automatically generated by wrapping the sklearn.decomposition.dict_learning.DictionaryLearning class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Finds a dictionary (a set of atoms) that can best be used to represent data using a sparse code.

Solves the optimization problem:

(U^*,V^*) = argmin 0.5 || Y - U V ||_2^2 + alpha * || U ||_1
            (U,V)
            with || V_k ||_2 = 1 for all  0 <= k < n_components

Read more in the User Guide.

Parameters

n_components: number of dictionary elements to extract
alpha: sparsity controlling parameter
max_iter: maximum number of iterations to perform
tol: tolerance for numerical error
fit_algorithm: ‘lars’ uses the least angle regression method to solve the lasso problem (linear_model.lars_path); ‘cd’ uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso). ‘lars’ will be faster if the estimated components are sparse.

New in version 0.17: cd coordinate descent method to improve speed.
transform_algorithm: Algorithm used to transform the data. ‘lars’ uses the least angle regression method (linear_model.lars_path); ‘lasso_lars’ uses Lars to compute the Lasso solution; ‘lasso_cd’ uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso), and ‘lasso_lars’ will be faster if the estimated components are sparse; ‘omp’ uses orthogonal matching pursuit to estimate the sparse solution; ‘threshold’ squashes to zero all coefficients less than alpha from the projection dictionary * X'.

New in version 0.17: lasso_cd coordinate descent method to improve speed.
transform_n_nonzero_coefs: Number of nonzero coefficients to target in each column of the solution. This is only used by algorithm=’lars’ and algorithm=’omp’ and is overridden by alpha in the omp case.
transform_alpha: If algorithm=’lasso_lars’ or algorithm=’lasso_cd’, alpha is the penalty applied to the L1 norm. If algorithm=’threshold’, alpha is the absolute value of the threshold below which coefficients will be squashed to zero. If algorithm=’omp’, alpha is the tolerance parameter: the value of the reconstruction error targeted. In this case, it overrides n_nonzero_coefs.
split_sign: Whether to split the sparse feature vector into the concatenation of its negative part and its positive part. This can improve the performance of downstream classifiers.
n_jobs: number of parallel jobs to run
code_init: initial value for the code, for warm restart
dict_init: initial values for the dictionary, for warm restart

verbose: degree of verbosity of the printed output
random_state: Pseudo-random number generator state used for random sampling.
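As an illustration of the split_sign option, the sketch below splits a sparse code vector into two non-negative halves and concatenates them. The ordering (positive part first, then the magnitudes of the negative part) is an assumption made for this example, and the function name is hypothetical.

```python
def split_sign(code):
    """Concatenate the positive part and the negated negative part of `code`."""
    positive = [c if c > 0 else 0.0 for c in code]
    negative = [-c if c < 0 else 0.0 for c in code]
    return positive + negative


print(split_sign([0.5, -2.0, 0.0]))  # [0.5, 0.0, 0.0, 0.0, 2.0, 0.0]
```

The output has twice as many entries as the input and none of them is negative, which can suit downstream classifiers that treat positive and negative activations differently.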

Attributes

components_: dictionary atoms extracted from the data
error_: vector of errors at each iteration
n_iter_: Number of iterations run.

Notes

References:

J. Mairal, F. Bach, J. Ponce, G. Sapiro, 2009: Online dictionary learning for sparse coding (http://www.di.ens.fr/sierra/pdfs/icml09.pdf)

See also

SparseCoder, MiniBatchDictionaryLearning, SparsePCA, MiniBatchSparsePCA

POSSIBLE NODE NAMES:
	DictionaryLearningTransformerSklearnNode DictionaryLearningTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ElasticNetCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ElasticNetCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Elastic Net model with iterative fitting along a regularization path

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.ElasticNetCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The best model is selected by cross-validation.

Read more in the User Guide.

Parameters

l1_ratio

:float or array of floats, optional

Float between 0 and 1 passed to ElasticNet (scaling between l1 and l2 penalties). For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2. This parameter can be a list, in which case the different values are tested by cross-validation and the one giving the best prediction score is used. Note that a good choice of list of values for l1_ratio is often to put more values close to 1 (i.e. Lasso) and fewer close to 0 (i.e. Ridge), as in [.1, .5, .7, .9, .95, .99, 1].

eps

:float, optional

Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

n_alphas

:int, optional

Number of alphas along the regularization path, used for each l1_ratio.

alphas

:numpy array, optional

List of alphas where to compute the models. If None, alphas are set automatically.

precompute

:True | False | ‘auto’ | array-like

Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto', the choice is made automatically. The Gram matrix can also be passed as argument.

max_iter

:int, optional

The maximum number of iterations.

tol

:float, optional

The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

verbose

:bool or integer

Amount of verbosity.

n_jobs

:integer, optional

Number of CPUs to use during the cross validation. If -1, use all the CPUs.

positive

:bool, optional

When set to True, forces the coefficients to be positive.

selection

:str, default ‘cyclic’

If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.

random_state

:int, RandomState instance, or None (default)

The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.

fit_intercept

:boolean

Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. data is expected to be already centered).

normalize

:boolean, optional, default False

If True, the regressors X will be normalized before regression.

copy_X

:boolean, optional, default True

If True, X will be copied; else, it may be overwritten.
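The interplay of eps and n_alphas can be illustrated with a small sketch that builds a geometrically spaced grid running from alpha_max down to alpha_max * eps. It assumes that alpha_max is already known (the wrapped class derives it from the data), and the function name and default values are chosen for the example only.

```python
def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Geometric grid of alphas with alpha_min / alpha_max == eps."""
    if n_alphas == 1:
        return [alpha_max]
    # Constant ratio between consecutive alphas (log-spaced grid).
    ratio = eps ** (1.0 / (n_alphas - 1))
    return [alpha_max * ratio ** i for i in range(n_alphas)]


grid = alpha_grid(alpha_max=1.0, eps=1e-3, n_alphas=4)
print(grid)  # roughly 1, 0.1, 0.01, 0.001, from largest to smallest
```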

Attributes

alpha_: The amount of penalization chosen by cross validation
l1_ratio_: The compromise between l1 and l2 penalization chosen by cross validation
coef_: Parameter vector (w in the cost function formula).
intercept_: Independent term in the decision function.
mse_path_: Mean square error for the test set on each fold, varying l1_ratio and alpha.
alphas_: The grid of alphas used for fitting, for each l1_ratio.
n_iter_: Number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.

Notes

See examples/linear_model/lasso_path_with_crossvalidation.py for an example.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

The parameter l1_ratio corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. More specifically, the optimization objective is:

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2

If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to:

a * L1 + b * L2

for:

alpha = a + b and l1_ratio = a / (a + b).
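The conversion between the two parameterizations can be checked numerically. The sketch below assumes that "L1" stands for ||w||_1 and "L2" for 0.5 * ||w||^2_2, so that the factor 0.5 in the objective is accounted for; the helper names are hypothetical and not part of the wrapped class.

```python
def to_alpha_l1_ratio(a, b):
    """Map separate L1/L2 weights (a, b) to (alpha, l1_ratio)."""
    if a < 0 or b < 0 or a + b == 0:
        raise ValueError("a and b must be non-negative and not both zero")
    alpha = a + b
    return alpha, a / alpha


def penalty(w, alpha, l1_ratio):
    """Penalty part of the objective above for a list of weights w."""
    l1 = sum(abs(x) for x in w)
    l2_squared = sum(x * x for x in w)
    return alpha * l1_ratio * l1 + 0.5 * alpha * (1 - l1_ratio) * l2_squared


a, b = 0.3, 0.1
alpha, l1_ratio = to_alpha_l1_ratio(a, b)
w = [1.0, -2.0, 0.5]

# Direct form under the assumption stated above:
# a * ||w||_1 + b * (0.5 * ||w||_2^2)
direct = a * sum(abs(x) for x in w) + 0.5 * b * sum(x * x for x in w)
print(alpha, l1_ratio, penalty(w, alpha, l1_ratio), direct)
```

Both forms give the same penalty value, which is what the equivalence above claims.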

See also

enet_path, ElasticNet

POSSIBLE NODE NAMES:
	ElasticNetCVRegressorSklearnNode ElasticNetCVRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ElasticNetRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ElasticNetRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear regression with combined L1 and L2 priors as regularizer.

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.ElasticNet class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Minimizes the objective function:

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2

If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to:

a * L1 + b * L2

where:

alpha = a + b and l1_ratio = a / (a + b)

The parameter l1_ratio corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. Specifically, l1_ratio = 1 is the lasso penalty. Currently, l1_ratio <= 0.01 is not reliable, unless you supply your own sequence of alpha.
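To make the roles of alpha and l1_ratio concrete, the following sketch evaluates the objective for a given weight vector using plain Python lists. It is a direct transcription of the formula, not the coordinate descent solver, and the function name elastic_net_objective is hypothetical.

```python
def elastic_net_objective(X, y, w, alpha=1.0, l1_ratio=0.5):
    """Evaluate the elastic net objective for rows X, targets y, weights w."""
    n_samples = len(y)
    residuals = [
        yi - sum(xij * wj for xij, wj in zip(xi, w))
        for xi, yi in zip(X, y)
    ]
    data_term = sum(r * r for r in residuals) / (2 * n_samples)
    l1 = sum(abs(wj) for wj in w)
    l2_squared = sum(wj * wj for wj in w)
    return (data_term
            + alpha * l1_ratio * l1
            + 0.5 * alpha * (1 - l1_ratio) * l2_squared)


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w = [1.0, 2.0]  # fits the data exactly, so only the penalty remains

lasso_like = elastic_net_objective(X, y, w, alpha=1.0, l1_ratio=1.0)
ridge_like = elastic_net_objective(X, y, w, alpha=1.0, l1_ratio=0.0)
mixed = elastic_net_objective(X, y, w, alpha=1.0, l1_ratio=0.5)
print(lasso_like, ridge_like, mixed)
```

With l1_ratio = 1 only the L1 term contributes (the lasso penalty), with l1_ratio = 0 only the squared L2 term, and intermediate values blend the two.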

Read more in the User Guide.

Parameters

alpha: Constant that multiplies the penalty terms. Defaults to 1.0. See the notes for the exact mathematical meaning of this parameter. alpha = 0 is equivalent to ordinary least squares, solved by the LinearRegression object. For numerical reasons, using alpha = 0 with the Lasso object is not advised and you should prefer the LinearRegression object.
l1_ratio: The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
fit_intercept: Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.
normalize: If True, the regressors X will be normalized before regression.
precompute: Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument. For sparse input this option is always True to preserve sparsity. WARNING: The 'auto' option is deprecated and will be removed in 0.18.
max_iter: The maximum number of iterations.
copy_X: If True, X will be copied; else, it may be overwritten.
tol: The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
warm_start: When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.
positive: When set to True, forces the coefficients to be positive.
selection: If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.
random_state: The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.

Attributes

coef_: parameter vector (w in the cost function formula)
sparse_coef_: sparse_coef_ is a readonly property derived from coef_
intercept_: independent term in decision function.
n_iter_: number of iterations run by the coordinate descent solver to reach the specified tolerance.

Notes

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

See also

SGDRegressor: implements elastic net regression with incremental training.
SGDClassifier: implements logistic regression with elastic net penalty (SGDClassifier(loss="log", penalty="elasticnet")).

POSSIBLE NODE NAMES:
	ElasticNetRegressorSklearnNode ElasticNetRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ExtraTreeClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ExtraTreeClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

An extremely randomized tree classifier.

This node has been automatically generated by wrapping the sklearn.tree.tree.ExtraTreeClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Extra-trees differ from classic decision trees in the way they are built. When looking for the best split to separate the samples of a node into two groups, random splits are drawn for each of the max_features randomly selected features and the best split among those is chosen. When max_features is set to 1, this amounts to building a totally random decision tree.
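The split search described above can be sketched as follows: pick max_features features at random, draw one random threshold per feature, and keep the candidate with the lowest weighted Gini impurity. This is an illustrative sketch only; the helper names and the scoring details are assumptions and do not reproduce the code of the wrapped class.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def random_best_split(X, y, max_features, rng):
    """Draw one random threshold per candidate feature and keep the best.

    Returns (score, feature_index, threshold) or None if no feature varies.
    """
    n_features = len(X[0])
    best = None
    for j in rng.sample(range(n_features), max_features):
        values = [row[j] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # constant feature: no valid split
        threshold = rng.uniform(lo, hi)
        left = [label for row, label in zip(X, y) if row[j] <= threshold]
        right = [label for row, label in zip(X, y) if row[j] > threshold]
        # Weighted impurity of the two children; lower is better.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[0]:
            best = (score, j, threshold)
    return best


X = [[0.0, 5.0], [1.0, 5.0], [10.0, 5.0], [11.0, 5.0]]
y = [0, 0, 1, 1]
best = random_best_split(X, y, max_features=2, rng=random.Random(0))
print(best)
```

With max_features=1 only a single random feature and threshold would be tried, which corresponds to the totally random tree mentioned above.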

Warning: Extra-trees should only be used within ensemble methods.

Read more in the User Guide.

See also

ExtraTreeRegressor, ExtraTreesClassifier, ExtraTreesRegressor

References

[1]	P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees”, Machine Learning, 63(1), 3-42, 2006.

POSSIBLE NODE NAMES:
	ExtraTreeClassifierSklearn ExtraTreeClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ExtraTreeRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ExtraTreeRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

An extremely randomized tree regressor.

This node has been automatically generated by wrapping the sklearn.tree.tree.ExtraTreeRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Extra-trees differ from classic decision trees in the way they are built. When looking for the best split to separate the samples of a node into two groups, random splits are drawn for each of the max_features randomly selected features and the best split among those is chosen. When max_features is set to 1, this amounts to building a totally random decision tree.

Warning: Extra-trees should only be used within ensemble methods.

Read more in the User Guide.

See also

ExtraTreeClassifier, ExtraTreesClassifier, ExtraTreesRegressor

References

[1]	P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees”, Machine Learning, 63(1), 3-42, 2006.

POSSIBLE NODE NAMES:
	ExtraTreeRegressorSklearn ExtraTreeRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ExtraTreesClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ExtraTreesClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

An extra-trees classifier.

This node has been automatically generated by wrapping the sklearn.ensemble.forest.ExtraTreesClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This class implements a meta estimator that fits a number of randomized decision trees (a.k.a. extra-trees) on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.

Read more in the User Guide.

Parameters

n_estimators

:integer, optional (default=10)

The number of trees in the forest.

criterion

:string, optional (default=“gini”)

The function to measure the quality of a split. Supported criteria are “gini” for the Gini impurity and “entropy” for the information gain. Note: this parameter is tree-specific.

max_features

:int, float, string or None, optional (default=“auto”)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.
If float, then max_features is a percentage and int(max_features * n_features) features are considered at each split.
If “auto”, then max_features=sqrt(n_features).
If “sqrt”, then max_features=sqrt(n_features).
If “log2”, then max_features=log2(n_features).
If None, then max_features=n_features.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. Note: this parameter is tree-specific.
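The rules listed above for this classifier can be summarized in a small helper that turns a max_features setting into a feature count. The function name is hypothetical, and the floor of 1 for "sqrt" and "log2" is an assumption added here so that at least one feature is always considered.

```python
import math


def resolve_max_features(max_features, n_features):
    """Translate a max_features setting into a number of features."""
    if max_features is None:
        return n_features
    if isinstance(max_features, str):
        if max_features in ("auto", "sqrt"):
            # Assumption: never go below one feature.
            return max(1, int(math.sqrt(n_features)))
        if max_features == "log2":
            return max(1, int(math.log2(n_features)))
        raise ValueError("unknown max_features: %r" % (max_features,))
    if isinstance(max_features, int):
        return max_features
    # Float: treated as a fraction of the available features.
    return int(max_features * n_features)


for setting in ("auto", "sqrt", "log2", None, 3, 0.5):
    print(repr(setting), "->", resolve_max_features(setting, 16))
```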

max_depth

:integer or None, optional (default=None)

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. Ignored if max_leaf_nodes is not None. Note: this parameter is tree-specific.

min_samples_split

:integer, optional (default=2)

The minimum number of samples required to split an internal node. Note: this parameter is tree-specific.

min_samples_leaf

:integer, optional (default=1)

The minimum number of samples in newly created leaves. A split is discarded if, after the split, one of the leaves would contain fewer than min_samples_leaf samples. Note: this parameter is tree-specific.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node. Note: this parameter is tree-specific.

max_leaf_nodes

:int or None, optional (default=None)

Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes. If not None then max_depth will be ignored. Note: this parameter is tree-specific.

bootstrap

:boolean, optional (default=False)

Whether bootstrap samples are used when building trees.

oob_score

:bool

Whether to use out-of-bag samples to estimate the generalization error.

n_jobs

:integer, optional (default=1)

The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

verbose

:int, optional (default=0)

Controls the verbosity of the tree building process.

warm_start

:bool, optional (default=False)

When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.

class_weight

:dict, list of dicts, “balanced”, “balanced_subsample” or None, optional

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. For multi-output problems, a list of dicts can be provided in the same order as the columns of y.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

The “balanced_subsample” mode is the same as “balanced” except that weights are computed based on the bootstrap sample for every tree grown.

For multi-output, the weights of each column of y will be multiplied.

Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.
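The “balanced” formula n_samples / (n_classes * np.bincount(y)) can be reproduced without NumPy for a single-output label list. The helper below is a hypothetical illustration of the arithmetic, not the code path used by the wrapped class.

```python
from collections import Counter


def balanced_class_weights(y):
    """Return {class_label: n_samples / (n_classes * class_count)}."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


y = [0, 0, 0, 0, 0, 0, 1, 1]  # six samples of class 0, two of class 1
weights = balanced_class_weights(y)
print(weights)
```

The rarer class receives the larger weight, and the weighted class totals (count times weight) come out equal for all classes.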

Attributes

estimators_: The collection of fitted sub-estimators.
classes_: The class labels (single output problem), or a list of arrays of class labels (multi-output problem).
n_classes_: The number of classes (single output problem), or a list containing the number of classes for each output (multi-output problem).
feature_importances_: The feature importances (the higher, the more important the feature).
n_features_: The number of features when fit is performed.
n_outputs_: The number of outputs when fit is performed.
oob_score_: Score of the training dataset obtained using an out-of-bag estimate.
oob_decision_function_: Decision function computed with out-of-bag estimate on the training set. If n_estimators is small it might be possible that a data point was never left out during the bootstrap. In this case, oob_decision_function_ might contain NaN.

References

[1]	P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees”, Machine Learning, 63(1), 3-42, 2006.

See also

sklearn.tree.ExtraTreeClassifier: Base classifier for this ensemble.
RandomForestClassifier: Ensemble classifier based on trees with optimal splits.

POSSIBLE NODE NAMES:
	ExtraTreesClassifierSklearnNode ExtraTreesClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ExtraTreesRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ExtraTreesRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

An extra-trees regressor.

This node has been automatically generated by wrapping the sklearn.ensemble.forest.ExtraTreesRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This class implements a meta estimator that fits a number of randomized decision trees (a.k.a. extra-trees) on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.

Read more in the User Guide.

Parameters

n_estimators

:integer, optional (default=10)

The number of trees in the forest.

criterion

:string, optional (default=“mse”)

The function to measure the quality of a split. The only supported criterion is “mse” for the mean squared error. Note: this parameter is tree-specific.

max_features

:int, float, string or None, optional (default=“auto”)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.
If float, then max_features is a percentage and int(max_features * n_features) features are considered at each split.
If “auto”, then max_features=n_features.
If “sqrt”, then max_features=sqrt(n_features).
If “log2”, then max_features=log2(n_features).
If None, then max_features=n_features.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. Note: this parameter is tree-specific.

max_depth

:integer or None, optional (default=None)

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. Ignored if max_leaf_nodes is not None. Note: this parameter is tree-specific.

min_samples_split

:integer, optional (default=2)

The minimum number of samples required to split an internal node. Note: this parameter is tree-specific.

min_samples_leaf

:integer, optional (default=1)

The minimum number of samples in newly created leaves. A split is discarded if, after the split, one of the leaves would contain fewer than min_samples_leaf samples. Note: this parameter is tree-specific.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node. Note: this parameter is tree-specific.

max_leaf_nodes

:int or None, optional (default=None)

Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes. If not None then max_depth will be ignored. Note: this parameter is tree-specific.

bootstrap

:boolean, optional (default=False)

Whether bootstrap samples are used when building trees. Note: this parameter is tree-specific.

oob_score

:bool

Whether to use out-of-bag samples to estimate the generalization error.

n_jobs

:integer, optional (default=1)

The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

verbose

:int, optional (default=0)

Controls the verbosity of the tree building process.

warm_start

:bool, optional (default=False)

When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.

Attributes

estimators_: The collection of fitted sub-estimators.
feature_importances_: The feature importances (the higher, the more important the feature).
n_features_: The number of features.
n_outputs_: The number of outputs.
oob_score_: Score of the training dataset obtained using an out-of-bag estimate.
oob_prediction_: Prediction computed with out-of-bag estimate on the training set.

References

[1]	P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees”, Machine Learning, 63(1), 3-42, 2006.

See also

sklearn.tree.ExtraTreeRegressor: Base estimator for this ensemble.
RandomForestRegressor: Ensemble regressor using trees with optimal splits.

POSSIBLE NODE NAMES:
	ExtraTreesRegressorSklearnNode ExtraTreesRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.FactorAnalysisTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.FactorAnalysisTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Factor Analysis (FA)

This node has been automatically generated by wrapping the sklearn.decomposition.factor_analysis.FactorAnalysis class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A simple linear generative model with Gaussian latent variables.

The observations are assumed to be caused by a linear transformation of lower dimensional latent factors and added Gaussian noise. Without loss of generality the factors are distributed according to a Gaussian with zero mean and unit covariance. The noise is also zero mean and has an arbitrary diagonal covariance matrix.

If the model were restricted further by assuming that the Gaussian noise is isotropic (all diagonal entries are the same), we would obtain PPCA.

FactorAnalysis performs a maximum likelihood estimate of the so-called loading matrix, the transformation of the latent variables to the observed ones, using expectation-maximization (EM).
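The generative model described above can be simulated directly: draw a standard Gaussian latent factor, map it through the loadings, and add independent Gaussian noise with a separate variance for each feature. The sketch below uses one latent factor and two observed features; the numeric values are made up for illustration, and the code simulates the model only, it does not perform the EM fit.

```python
import random

rng = random.Random(0)
loadings = [2.0, -1.0]    # loading "matrix" W for a single latent factor
noise_var = [0.5, 0.25]   # diagonal of the noise covariance, one per feature
n_samples = 20000

samples = []
for _ in range(n_samples):
    z = rng.gauss(0.0, 1.0)  # latent factor: zero mean, unit variance
    x = [w * z + rng.gauss(0.0, psi ** 0.5)
         for w, psi in zip(loadings, noise_var)]
    samples.append(x)


def covariance(i, j):
    """Sample covariance between observed features i and j."""
    mean_i = sum(s[i] for s in samples) / n_samples
    mean_j = sum(s[j] for s in samples) / n_samples
    return sum((s[i] - mean_i) * (s[j] - mean_j)
               for s in samples) / (n_samples - 1)


# Under the model, the covariance of x is W W^T + diag(noise_var).
print("var(x0):", covariance(0, 0), "model:", loadings[0] ** 2 + noise_var[0])
print("var(x1):", covariance(1, 1), "model:", loadings[1] ** 2 + noise_var[1])
print("cov(x0, x1):", covariance(0, 1), "model:", loadings[0] * loadings[1])
```

Setting both entries of noise_var to the same value would give the isotropic case that corresponds to PPCA.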

Read more in the User Guide.

Parameters

n_components: Dimensionality of latent space, the number of components of X that are obtained after transform. If None, n_components is set to the number of features.
tol: Stopping tolerance for EM algorithm.
copy: Whether to make a copy of X. If False, the input X gets overwritten during fitting.
max_iter: Maximum number of iterations.
noise_variance_init: The initial guess of the noise variance for each feature. If None, it defaults to np.ones(n_features).
svd_method: Which SVD method to use. If ‘lapack’ use standard SVD from scipy.linalg, if ‘randomized’ use fast randomized_svd function. Defaults to ‘randomized’. For most applications ‘randomized’ will be sufficiently precise while providing significant speed gains. Accuracy can also be improved by setting higher values for iterated_power. If this is not sufficient, for maximum precision you should choose ‘lapack’.
iterated_power: Number of iterations for the power method. 3 by default. Only used if svd_method equals ‘randomized’.
random_state: Pseudo-random number generator state used for random sampling. Only used if svd_method equals ‘randomized’.

Attributes

components_: Components with maximum variance.
loglike_: The log likelihood at each iteration.
noise_variance_: The estimated noise variance for each feature.
n_iter_: Number of iterations run.

See also

PCA: Principal component analysis is also a latent linear variable model, which however assumes equal noise variance for each feature. This extra assumption makes probabilistic PCA faster as it can be computed in closed form.
FastICA: Independent component analysis, a latent variable model with non-Gaussian latent variables.

POSSIBLE NODE NAMES:
	FactorAnalysisTransformerSklearn FactorAnalysisTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.FeatureAgglomerationTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.FeatureAgglomerationTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Agglomerate features.

This node has been automatically generated by wrapping the sklearn.cluster.hierarchical.FeatureAgglomeration class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Similar to AgglomerativeClustering, but recursively merges features instead of samples.

Read more in the User Guide.

Parameters

n_clusters

:int, default 2

The number of clusters to find.

connectivity

:array-like or callable, optional

Connectivity matrix. Defines for each feature the neighboring features following a given structure of the data. This can be a connectivity matrix itself or a callable that transforms the data into a connectivity matrix, such as derived from kneighbors_graph. Default is None, i.e., the hierarchical clustering algorithm is unstructured.

affinity

:string or callable, default “euclidean”

Metric used to compute the linkage. Can be “euclidean”, “l1”, “l2”, “manhattan”, “cosine”, or “precomputed”. If linkage is “ward”, only “euclidean” is accepted.

memory

:Instance of joblib.Memory or string, optional

Used to cache the output of the computation of the tree. By default, no caching is done. If a string is given, it is the path to the caching directory.

n_components

:int (optional)

Number of connected components. If None the number of connected components is estimated from the connectivity matrix. NOTE: This parameter is now directly determined from the connectivity matrix and will be removed in 0.18.

compute_full_tree

:bool or ‘auto’, optional, default “auto”

Stop early the construction of the tree at n_clusters. This is useful to decrease computation time if the number of clusters is not small compared to the number of features. This option is useful only when specifying a connectivity matrix. Note also that when varying the number of clusters and using caching, it may be advantageous to compute the full tree.

linkage

:{“ward”, “complete”, “average”}, optional, default “ward”

Which linkage criterion to use. The linkage criterion determines which distance to use between sets of features. The algorithm will merge the pairs of clusters that minimize this criterion.

ward minimizes the variance of the clusters being merged.
average uses the average of the distances of each feature of the two sets.
complete or maximum linkage uses the maximum distances between all features of the two sets.

pooling_func

:callable, default np.mean

This combines the values of agglomerated features into a single value, and should accept an array of shape [M, N] and the keyword argument axis=1, and reduce it to an array of size [M].

Attributes

labels_: cluster labels for each feature.
n_leaves_: Number of leaves in the hierarchical tree.
n_components_: The estimated number of connected components in the graph.
children_: The children of each non-leaf node. Values less than n_features correspond to leaves of the tree, which are the original features. A node i greater than or equal to n_features is a non-leaf node and has children children_[i - n_features]. Alternatively, at the i-th iteration, children_[i][0] and children_[i][1] are merged to form node n_features + i.
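The indexing convention of children_ can be hard to read from the description alone. The sketch below walks a small, made-up merge history for four features and recovers which original features end up under a given node. The list children stands in for the children_ attribute and the helper name leaves_under is hypothetical.

```python
def leaves_under(node, children, n_features):
    """Return the original feature indices merged into `node`."""
    if node < n_features:
        return [node]  # a value below n_features is a leaf
    left, right = children[node - n_features]
    return (leaves_under(left, children, n_features)
            + leaves_under(right, children, n_features))


# Made-up merge history for 4 features; rows play the role of children_.
n_features = 4
children = [[0, 1], [2, 3], [4, 5]]

for i, (left, right) in enumerate(children):
    print("iteration", i, "merges", left, "and", right,
          "into node", n_features + i)

print(leaves_under(4, children, n_features))
print(leaves_under(6, children, n_features))
```

Node 6 is the root in this example, so it contains all four original features.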

POSSIBLE NODE NAMES:
	FeatureAgglomerationTransformerSklearn FeatureAgglomerationTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.FeatureHasherTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.FeatureHasherTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Implements feature hashing, aka the hashing trick.

This node has been automatically generated by wrapping the sklearn.feature_extraction.hashing.FeatureHasher class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This class turns sequences of symbolic feature names (strings) into scipy.sparse matrices, using a hash function to compute the matrix column corresponding to a name. The hash function employed is the signed 32-bit version of Murmurhash3.

Feature names of type byte string are used as-is. Unicode strings are converted to UTF-8 first, but no Unicode normalization is done. Feature values must be (finite) numbers.

This class is a low-memory alternative to DictVectorizer and CountVectorizer, intended for large-scale (online) learning and situations where memory is tight, e.g. when running prediction code on embedded devices.

Read more in the User Guide.

Parameters

n_features: The number of features (columns) in the output matrices. Small numbers of features are likely to cause hash collisions, but large numbers will cause larger coefficient dimensions in linear learners.
dtype: The type of feature values. Passed to scipy.sparse matrix constructors as the dtype argument. Do not set this to bool, np.bool_ or any unsigned integer type.
input_type: Either “dict” (the default) to accept dictionaries over (feature_name, value); “pair” to accept pairs of (feature_name, value); or “string” to accept single strings. feature_name should be a string, while value should be a number. In the case of “string”, a value of 1 is implied. The feature_name is hashed to find the appropriate column for the feature. The value’s sign might be flipped in the output (but see non_negative, below).
non_negative: Whether output matrices should contain non-negative values only; effectively calls abs on the matrix prior to returning it. When True, output values can be interpreted as frequencies. When False, output values will have expected value zero.

Examples

>>> from sklearn.feature_extraction import FeatureHasher
>>> h = FeatureHasher(n_features=10)
>>> D = [{'dog': 1, 'cat':2, 'elephant':4},{'dog': 2, 'run': 5}]
>>> f = h.transform(D)
>>> f.toarray()
array([[ 0.,  0., -4., -1.,  0.,  0.,  0.,  0.,  0.,  2.],
       [ 0.,  0.,  0., -2., -5.,  0.,  0.,  0.,  0.,  0.]])
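The idea behind the hashing trick can also be sketched without sklearn. The version below hashes each feature name to a signed 32-bit integer, uses its magnitude to pick a column and its sign to pick the sign of the value. It uses CRC32 from the standard library as a stand-in hash, which is an assumption made for portability: the wrapped class uses MurmurHash3, so the columns and signs produced here will not match the output shown above.

```python
import zlib


def signed_hash32(name):
    """CRC32 of the UTF-8 bytes, reinterpreted as a signed 32-bit integer."""
    h = zlib.crc32(name.encode("utf-8"))
    return h - 2 ** 32 if h >= 2 ** 31 else h


def hash_features(sample, n_features=10):
    """Map a {feature_name: value} dict to a fixed-length list of floats."""
    row = [0.0] * n_features
    for name, value in sample.items():
        h = signed_hash32(name)
        column = abs(h) % n_features
        # Colliding features are summed; the sign keeps the expected value
        # of a collision close to zero.
        row[column] += value if h >= 0 else -value
    return row


D = [{"dog": 1, "cat": 2, "elephant": 4}, {"dog": 2, "run": 5}]
rows = [hash_features(d) for d in D]
for row in rows:
    print(row)
```

No vocabulary is stored: the same feature name always lands in the same column, which is what keeps memory use low.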

See also

DictVectorizer: vectorizes string-valued features using a hash table.
sklearn.preprocessing.OneHotEncoder: handles nominal/categorical features encoded as columns of integers.

POSSIBLE NODE NAMES:
	FeatureHasherTransformerSklearnNode FeatureHasherTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ForestRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ForestRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Base class for forest of trees-based regressors.

This node has been automatically generated by wrapping the sklearn.ensemble.forest.ForestRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Warning: This class should not be used directly. Use derived classes instead.

POSSIBLE NODE NAMES:
	ForestRegressorSklearn ForestRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.FunctionTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.FunctionTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Constructs a transformer from an arbitrary callable.

This node has been automatically generated by wrapping the sklearn.preprocessing._function_transformer.FunctionTransformer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A FunctionTransformer forwards its X (and optionally y) arguments to a user-defined function or function object and returns the result of this function. This is useful for stateless transformations such as taking the log of frequencies, doing custom scaling, etc.

A FunctionTransformer will not do any checks on its function’s output.

Note: If a lambda is used as the function, then the resulting transformer will not be pickleable.
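The pickling restriction comes from Python itself and can be checked with the standard library alone. The sketch below compares an importable function with a lambda; can_pickle is a hypothetical helper, and the example does not involve the wrapped transformer.

```python
import math
import pickle


def can_pickle(func):
    """Return True if `func` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(func)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


named_ok = can_pickle(math.log1p)           # importable by module and name
lambda_ok = can_pickle(lambda x: x + 1.0)   # anonymous, cannot be looked up
print(named_ok, lambda_ok)
```

Functions are pickled by reference to their module and name, so passing an importable function as func keeps the resulting node storable.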

New in version 0.17.

Parameters

func: The callable to use for the transformation. This will be passed the same arguments as transform, with args and kwargs forwarded. If func is None, then func will be the identity function.
validate: Indicate that the input X array should be checked before calling func. If validate is false, there will be no input validation. If it is true, then X will be converted to a 2-dimensional NumPy array or sparse matrix. If this conversion is not possible or X contains NaN or infinity, an exception is raised.
accept_sparse: Indicate that func accepts a sparse matrix as input. If validate is False, this has no effect. Otherwise, if accept_sparse is false, sparse matrix inputs will cause an exception to be raised.
pass_y: Indicate that transform should forward the y argument to the inner callable. Defaults to False.

POSSIBLE NODE NAMES:
	FunctionTransformerSklearn FunctionTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GaussianNBClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GaussianNBClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Gaussian Naive Bayes (GaussianNB)

This node has been automatically generated by wrapping the sklearn.naive_bayes.GaussianNB class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Can perform online updates to model parameters via the partial_fit method. For details on the algorithm used to update feature means and variances online, see Stanford CS tech report STAN-CS-79-773 by Chan, Golub, and LeVeque:

http://i.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf
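The core of such an online update is a rule for merging the summary statistics of two batches without revisiting the data. The sketch below shows the pairwise combination formula for a single feature, as commonly attributed to the cited report; it is an illustration with hypothetical helper names, not the code used by partial_fit.

```python
def summarize(values):
    """Return (count, mean, sum of squared deviations from the mean)."""
    n = len(values)
    mean = sum(values) / n
    return n, mean, sum((v - mean) ** 2 for v in values)


def combine(n_a, mean_a, m2_a, n_b, mean_b, m2_b):
    """Merge two (count, mean, squared-deviation sum) summaries."""
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


first, second = [1.0, 2.0, 3.0], [10.0, 12.0]
n, mean, m2 = combine(*summarize(first), *summarize(second))
print("mean:", mean, "variance:", m2 / n)
```

The merged mean and variance agree with those computed from all five values at once, which is what allows the per-class statistics to be updated batch by batch.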

Read more in the User Guide.

Attributes

class_prior_: probability of each class.
class_count_: number of training samples observed in each class.
theta_: mean of each feature per class
sigma_: variance of each feature per class

Examples

>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> Y = np.array([1, 1, 1, 2, 2, 2])
>>> from sklearn.naive_bayes import GaussianNB
>>> clf = GaussianNB()
>>> clf.fit(X, Y)
GaussianNB()
>>> print(clf.predict([[-0.8, -1]]))
[1]
>>> clf_pf = GaussianNB()
>>> clf_pf.partial_fit(X, Y, np.unique(Y))
GaussianNB()
>>> print(clf_pf.predict([[-0.8, -1]]))
[1]
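To make the roles of class_prior_, theta_ and sigma_ more concrete, the following pure-Python sketch re-derives the prediction of the example above by hand. The helper names `fit_gaussian_nb` and `predict_gaussian_nb` are hypothetical, and the small variance floor of 1e-9 is an assumption added for numerical safety; the wrapped class may handle this differently.

```python
import math


def fit_gaussian_nb(X, y):
    """Return {label: (prior, means, variances)} estimated from the training data."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9  # small floor, an assumption
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Pick the label with the highest log prior plus Gaussian log-likelihood."""
    def log_posterior(params):
        prior, means, variances = params
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances)
        )
        return math.log(prior) + log_lik
    return max(model, key=lambda label: log_posterior(model[label]))


X = [[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]]
y = [1, 1, 1, 2, 2, 2]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [-0.8, -1]))  # prints 1
```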

POSSIBLE NODE NAMES:
	GaussianNBClassifierSklearn GaussianNBClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GaussianProcessRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GaussianProcessRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

The Gaussian Process model class.

This node has been automatically generated by wrapping the sklearn.gaussian_process.gaussian_process.GaussianProcess class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

regr

:string or callable, optional

A regression function returning an array of outputs of the linear regression functional basis. The number of observations n_samples should be greater than the size p of this basis. Default assumes a simple constant regression trend. Available built-in regression models are:

'constant', 'linear', 'quadratic'

corr

:string or callable, optional

A stationary autocorrelation function returning the autocorrelation between two points x and x’. Default assumes a squared-exponential autocorrelation model. Built-in correlation models are:

'absolute_exponential', 'squared_exponential',
'generalized_exponential', 'cubic', 'linear'

beta0

:double array_like, optional

The regression weight vector to perform Ordinary Kriging (OK). Default assumes Universal Kriging (UK) so that the vector beta of regression weights is estimated using the maximum likelihood principle.

storage_mode

:string, optional

A string specifying whether the Cholesky decomposition of the correlation matrix should be stored in the class (storage_mode = ‘full’) or not (storage_mode = ‘light’). Default assumes storage_mode = ‘full’, so that the Cholesky decomposition of the correlation matrix is stored. This might be a useful parameter when one is not interested in the MSE and only plans to estimate the BLUP, for which the correlation matrix is not required.

verbose

:boolean, optional

A boolean specifying the verbose level. Default is verbose = False.

theta0

:double array_like, optional

An array with shape (n_features, ) or (1, ). The parameters in the autocorrelation model. If thetaL and thetaU are also specified, theta0 is considered as the starting point for the maximum likelihood estimation of the best set of parameters. Default assumes isotropic autocorrelation model with theta0 = 1e-1.

thetaL

:double array_like, optional

An array with shape matching theta0’s. Lower bound on the autocorrelation parameters for maximum likelihood estimation. Default is None, so that it skips maximum likelihood estimation and it uses theta0.

thetaU

:double array_like, optional

An array with shape matching theta0’s. Upper bound on the autocorrelation parameters for maximum likelihood estimation. Default is None, so that it skips maximum likelihood estimation and it uses theta0.

normalize

:boolean, optional

Input X and observations y are centered and reduced with respect to means and standard deviations estimated from the n_samples observations provided. Default is normalize = True so that data is normalized to ease maximum likelihood estimation.

nugget

:double or ndarray, optional

Introduce a nugget effect to allow smooth predictions from noisy data. If nugget is an ndarray, it must be the same length as the number of data points used for the fit. The nugget is added to the diagonal of the assumed training covariance; in this way it acts as a Tikhonov regularization in the problem. In the special case of the squared exponential correlation function, the nugget mathematically represents the variance of the input values. Default assumes a nugget close to machine precision for the sake of robustness (nugget = 10. * MACHINE_EPSILON).

optimizer

:string, optional

A string specifying the optimization algorithm to be used. Default uses ‘fmin_cobyla’ algorithm from scipy.optimize. Available optimizers are:

'fmin_cobyla', 'Welch'

The ‘Welch’ optimizer is due to Welch et al., see reference [WBSWM1992]. It iterates over several one-dimensional optimizations instead of running one single multi-dimensional optimization.

random_start

:int, optional

The number of times the Maximum Likelihood Estimation should be performed from a random starting point. The first MLE always uses the specified starting point (theta0), the next starting points are picked at random according to an exponential distribution (log-uniform on [thetaL, thetaU]). Default does not use random starting point (random_start = 1).

random_state: integer or numpy.RandomState, optional

The generator used to shuffle the sequence of coordinates of theta in the Welch optimizer. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
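The random_start behaviour described above can be pictured with the following sketch, which draws the additional starting points uniformly in log10 space between thetaL and thetaU. It assumes a single scalar theta, and the function name `random_theta_starts` and its `seed` argument are hypothetical; this is an illustration, not the wrapped implementation.

```python
import math
import random


def random_theta_starts(theta0, theta_l, theta_u, random_start, seed=0):
    """Return the starting points tried by a multi-start likelihood search."""
    rng = random.Random(seed)
    starts = [theta0]  # the first run always uses the given theta0
    for _ in range(random_start - 1):
        log_theta = rng.uniform(math.log10(theta_l), math.log10(theta_u))
        starts.append(10.0 ** log_theta)  # uniform in log space
    return starts


starts = random_theta_starts(theta0=0.1, theta_l=1e-3, theta_u=1.0, random_start=5)
print(starts)
```

Sampling in log space spreads the starting points evenly across orders of magnitude, which is usually what one wants for scale-like parameters such as theta.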

Attributes

theta_: Specified theta OR the best set of autocorrelation parameters (the sought maximizer of the reduced likelihood function).
reduced_likelihood_function_value_: The optimal reduced likelihood function value.

Examples

>>> import numpy as np
>>> from sklearn.gaussian_process import GaussianProcess
>>> X = np.array([[1., 3., 5., 6., 7., 8.]]).T
>>> y = (X * np.sin(X)).ravel()
>>> gp = GaussianProcess(theta0=0.1, thetaL=.001, thetaU=1.)
>>> gp.fit(X, y)                                      
GaussianProcess(beta0=None...
        ...
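To illustrate how theta0 and nugget enter the model, the sketch below builds the squared-exponential correlation matrix for the inputs of the example above and adds the nugget to its diagonal. It deliberately ignores the normalize step, so the numbers are not those used inside the wrapped class; the helper names are hypothetical.

```python
import math


def squared_exponential(theta, x_a, x_b):
    """Correlation exp(-sum_j theta_j * (x_a[j] - x_b[j])**2) between two points."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, x_a, x_b)))


def correlation_matrix(X, theta, nugget):
    """Pairwise correlations with the nugget added on the diagonal."""
    n = len(X)
    R = [[squared_exponential(theta, X[i], X[j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        R[i][i] += nugget  # acts like a Tikhonov regularization term
    return R


X = [[1.0], [3.0], [5.0], [6.0], [7.0], [8.0]]
R = correlation_matrix(X, theta=[0.1], nugget=1e-10)
print(R[0][1])
```

A larger theta makes the correlation decay faster with distance, while a larger nugget trades exact interpolation for smoother predictions.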

Notes

The present implementation is based on a translation of the DACE Matlab toolbox, see reference [NLNS2002].

References

[NLNS2002]

S.N. Lophaven, H.B. Nielsen and J. Sondergaard. DACE - A MATLAB Kriging Toolbox. (2002) http://www2.imm.dtu.dk/~hbn/dace/dace.pdf

[WBSWM1992]

W.J. Welch, R.J. Buck, J. Sacks, H.P. Wynn, T.J. Mitchell, and M.D. Morris (1992). Screening, predicting, and computer experiments. Technometrics, 34(1) 15–25. http://www.jstor.org/pss/1269548

POSSIBLE NODE NAMES:
	GaussianProcessRegressorSklearn GaussianProcessRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GaussianRandomProjectionHashTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GaussianRandomProjectionHashTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

This node has been automatically generated by wrapping the sklearn.neighbors.approximate.GaussianRandomProjectionHash class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

POSSIBLE NODE NAMES:
	GaussianRandomProjectionHashTransformerSklearn GaussianRandomProjectionHashTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GaussianRandomProjectionTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GaussianRandomProjectionTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Reduce dimensionality through Gaussian random projection

This node has been automatically generated by wrapping the sklearn.random_projection.GaussianRandomProjection class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The components of the random matrix are drawn from N(0, 1 / n_components).
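A minimal sketch of this construction, assuming plain Python lists instead of NumPy arrays: each entry of the matrix is drawn from a normal distribution with variance 1 / n_components, and a feature vector is projected by taking its dot product with every row. The function names and the `seed` argument are hypothetical.

```python
import math
import random


def gaussian_random_matrix(n_components, n_features, seed=0):
    """Matrix with entries drawn from N(0, 1 / n_components)."""
    rng = random.Random(seed)
    std = 1.0 / math.sqrt(n_components)  # variance 1 / n_components
    return [[rng.gauss(0.0, std) for _ in range(n_features)]
            for _ in range(n_components)]


def project(components, x):
    """Map one feature vector onto the rows of the random matrix."""
    return [sum(c * v for c, v in zip(row, x)) for row in components]


components = gaussian_random_matrix(n_components=3, n_features=10)
projected = project(components, [1.0] * 10)
print(projected)
```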

Read more in the User Guide.

Parameters

n_components

:int or ‘auto’, optional (default = ‘auto’)

Dimensionality of the target projection space.

n_components can be automatically adjusted according to the number of samples in the dataset and the bound given by the Johnson-Lindenstrauss lemma. In that case the quality of the embedding is controlled by the eps parameter.

It should be noted that the Johnson-Lindenstrauss lemma can yield very conservative estimates of the required number of components, as it makes no assumption about the structure of the dataset.

eps

:strictly positive float, optional (default=0.1)

Parameter to control the quality of the embedding according to the Johnson-Lindenstrauss lemma when n_components is set to ‘auto’.

Smaller values lead to better embedding and higher number of dimensions (n_components) in the target projection space.

random_state

:integer, RandomState instance or None (default=None)

Control the pseudo random number generator used to generate the matrix at fit time.
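How n_components=‘auto’ and eps interact can be sketched with the usual form of the Johnson-Lindenstrauss bound, n_components >= 4 log(n_samples) / (eps^2 / 2 - eps^3 / 3). The helper name `jl_min_dim` is hypothetical and the code is an illustration of the bound, not a copy of the wrapped implementation.

```python
import math


def jl_min_dim(n_samples, eps):
    """Number of components suggested by the Johnson-Lindenstrauss bound."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    denominator = eps ** 2 / 2 - eps ** 3 / 3
    return int(4 * math.log(n_samples) / denominator)


for eps in (0.5, 0.1):
    print(eps, jl_min_dim(1000000, eps))
```

Note that the bound depends only on the number of samples and eps, not on the original number of features, which is why it can exceed the input dimensionality for small eps.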

Attributes

n_components_: Concrete number of components computed when n_components=‘auto’.
components_: Random matrix used for the projection.

See Also

SparseRandomProjection

POSSIBLE NODE NAMES:
	GaussianRandomProjectionTransformerSklearn GaussianRandomProjectionTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GenericUnivariateSelectTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GenericUnivariateSelectTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Univariate feature selector with configurable strategy.

This node has been automatically generated by wrapping the sklearn.feature_selection.univariate_selection.GenericUnivariateSelect class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

score_func: Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues).
mode: Feature selection mode.
param: Parameter of the corresponding mode.

Attributes

scores_: Scores of features.
pvalues_: p-values of feature scores.
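The interplay of mode and param can be sketched as follows. Only two strategies are shown, and the mode strings 'k_best' and 'fpr' as well as the function name `select_features` are assumptions made for this illustration; the scores and p-values are invented placeholder numbers, not results.

```python
def select_features(scores, pvalues, mode, param):
    """Return the sorted indices of the features kept by the chosen mode."""
    indices = range(len(scores))
    if mode == "k_best":
        # keep the `param` features with the highest scores
        top = sorted(indices, key=lambda i: scores[i], reverse=True)[:param]
        return sorted(top)
    if mode == "fpr":
        # keep features whose p-value is below the threshold `param`
        return [i for i in indices if pvalues[i] < param]
    raise ValueError("mode not covered by this sketch: %r" % mode)


scores = [10.0, 0.5, 7.2, 3.1]      # placeholder values
pvalues = [0.001, 0.6, 0.01, 0.08]  # placeholder values
print(select_features(scores, pvalues, "k_best", 2))
print(select_features(scores, pvalues, "fpr", 0.05))
```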

See also

f_classif: ANOVA F-value between label/feature for classification tasks.
chi2: Chi-squared stats of non-negative features for classification tasks.
f_regression: F-value between label/feature for regression tasks.
SelectPercentile: Select features based on percentile of the highest scores.
SelectKBest: Select features based on the k highest scores.
SelectFpr: Select features based on a false positive rate test.
SelectFdr: Select features based on an estimated false discovery rate.
SelectFwe: Select features based on family-wise error rate.

POSSIBLE NODE NAMES:
	GenericUnivariateSelectTransformerSklearnNode GenericUnivariateSelectTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GradientBoostingClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GradientBoostingClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Gradient Boosting for classification.

This node has been automatically generated by wrapping the sklearn.ensemble.gradient_boosting.GradientBoostingClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

GB builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the binomial or multinomial deviance loss function. Binary classification is a special case where only a single regression tree is induced.
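For the binary case, the negative gradient of the binomial deviance with respect to the current raw score F reduces to the label minus the predicted probability. The sketch below shows these pseudo-residuals, which are the targets each new regression tree is fit on. It assumes labels in {0, 1}, and the function name is hypothetical.

```python
import math


def negative_gradient(y, raw_scores):
    """Pseudo-residuals y - sigmoid(F) for labels in {0, 1}."""
    return [t - 1.0 / (1.0 + math.exp(-f)) for t, f in zip(y, raw_scores)]


residuals = negative_gradient([1, 0, 1], [0.0, 0.0, 2.0])
print(residuals)
```

Samples that are already classified with high confidence produce residuals close to zero and therefore contribute little to the next stage.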

Read more in the User Guide.

Parameters

loss

:{‘deviance’, ‘exponential’}, optional (default=’deviance’)

Loss function to be optimized. ‘deviance’ refers to deviance (= logistic regression) for classification with probabilistic outputs. For loss ‘exponential’ gradient boosting recovers the AdaBoost algorithm.

learning_rate

:float, optional (default=0.1)

Learning rate shrinks the contribution of each tree by learning_rate. There is a trade-off between learning_rate and n_estimators.

n_estimators

:int (default=100)

The number of boosting stages to perform. Gradient boosting is fairly robust to over-fitting so a large number usually results in better performance.

max_depth

:integer, optional (default=3)

Maximum depth of the individual regression estimators. The maximum depth limits the number of nodes in the tree. Tune this parameter for best performance; the best value depends on the interaction of the input variables. Ignored if max_leaf_nodes is not None.

min_samples_split

:integer, optional (default=2)

The minimum number of samples required to split an internal node.

min_samples_leaf

:integer, optional (default=1)

The minimum number of samples required to be at a leaf node.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node.

subsample

:float, optional (default=1.0)

The fraction of samples to be used for fitting the individual base learners. If smaller than 1.0 this results in Stochastic Gradient Boosting. subsample interacts with the parameter n_estimators. Choosing subsample < 1.0 leads to a reduction of variance and an increase in bias.

max_features

:int, float, string or None, optional (default=None)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.

If float, then max_features is a percentage and int(max_features * n_features) features are considered at each split.

If “auto”, then max_features=sqrt(n_features).

If “sqrt”, then max_features=sqrt(n_features).

If “log2”, then max_features=log2(n_features).

If None, then max_features=n_features.

Choosing max_features < n_features leads to a reduction of variance and an increase in bias.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if this requires effectively inspecting more than max_features features.

max_leaf_nodes

:int or None, optional (default=None)

Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes. If not None then max_depth will be ignored.

init

:BaseEstimator, None, optional (default=None)

An estimator object that is used to compute the initial predictions. init has to provide fit and predict. If None it uses loss.init_estimator.

verbose

:int, default: 0

Enable verbose output. If 1 then it prints progress and performance once in a while (the more trees the lower the frequency). If greater than 1 then it prints progress and performance for every tree.

warm_start

:bool, default: False

When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just erase the previous solution.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

presort

:bool or ‘auto’, optional (default=’auto’)

Whether to presort the data to speed up the finding of best splits in fitting. Auto mode by default will use presorting on dense data and default to normal sorting on sparse data. Setting presort to true on sparse data will raise an error.

New in version 0.17: presort parameter.
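The max_features rules listed in the Parameters above can be summarized in a few lines. This is a sketch that follows the wording of this page literally; the function name `resolve_max_features` and the `auto_is_sqrt` flag are hypothetical, and the wrapped library may additionally clamp the result, which is not modelled here.

```python
import math


def resolve_max_features(max_features, n_features, auto_is_sqrt=True):
    """Translate a max_features setting into a number of features per split."""
    if max_features is None:
        return n_features
    if max_features == "auto":
        # sqrt(n_features) as documented for this classifier node; pass
        # auto_is_sqrt=False for estimators that document "auto" as n_features.
        return int(math.sqrt(n_features)) if auto_is_sqrt else n_features
    if max_features == "sqrt":
        return int(math.sqrt(n_features))
    if max_features == "log2":
        return int(math.log2(n_features))
    if isinstance(max_features, float):
        return int(max_features * n_features)  # interpreted as a fraction
    return int(max_features)                   # plain integer count


for setting in (None, "auto", "sqrt", "log2", 0.25, 10):
    print(setting, resolve_max_features(setting, n_features=64))
```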

Attributes

feature_importances_: The feature importances (the higher, the more important the feature).
oob_improvement_: The improvement in loss (= deviance) on the out-of-bag samples relative to the previous iteration. oob_improvement_[0] is the improvement in loss of the first stage over the init estimator.
train_score_: The i-th score train_score_[i] is the deviance (= loss) of the model at iteration i on the in-bag sample. If subsample == 1 this is the deviance on the training data.
loss_: The concrete LossFunction object.
init: The estimator that provides the initial predictions. Set via the init argument or loss.init_estimator.
estimators_: The collection of fitted sub-estimators. loss_.K is 1 for binary classification, otherwise n_classes.

See also

sklearn.tree.DecisionTreeClassifier, RandomForestClassifier, AdaBoostClassifier

References

J. Friedman, Greedy Function Approximation: A Gradient Boosting Machine, The Annals of Statistics, Vol. 29, No. 5, 2001.

J. Friedman, Stochastic Gradient Boosting, 1999.

T. Hastie, R. Tibshirani and J. Friedman. Elements of Statistical Learning Ed. 2, Springer, 2009.

POSSIBLE NODE NAMES:
	GradientBoostingClassifierSklearn GradientBoostingClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GradientBoostingRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GradientBoostingRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Gradient Boosting for regression.

This node has been automatically generated by wrapping the sklearn.ensemble.gradient_boosting.GradientBoostingRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

GB builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.
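The forward stage-wise scheme can be illustrated with a deliberately small sketch: least-squares loss, a one-dimensional input, and decision stumps in place of full regression trees. All function names are hypothetical and the toy data is made up for illustration; this is not the algorithm as implemented in the wrapped class, only the idea behind it.

```python
def fit_stump(x, residuals):
    """Best single-threshold split of a 1-D input under squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for v, r in zip(x, residuals) if v <= threshold]
        right = [r for v, r in zip(x, residuals) if v > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_estimators=50, learning_rate=0.1):
    """Forward stage-wise fit; returns initial value, stumps and fitted values."""
    init = sum(y) / len(y)                     # constant initial prediction
    prediction = [init] * len(y)
    stumps = []
    for _ in range(n_estimators):
        # for least squares the negative gradient is the plain residual
        residuals = [t - p for t, p in zip(y, prediction)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        prediction = [
            p + learning_rate * (left_mean if v <= threshold else right_mean)
            for v, p in zip(x, prediction)
        ]
    return init, stumps, prediction


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]     # toy inputs
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]     # toy targets
init, stumps, fitted = boost(x, y)
mse = sum((t - p) ** 2 for t, p in zip(y, fitted)) / len(y)
print(mse)
```

Lowering learning_rate makes each stage contribute less, so more stages (n_estimators) are needed to reach the same training error, which is the trade-off mentioned in the parameter descriptions.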

Read more in the User Guide.

Parameters

loss

:{‘ls’, ‘lad’, ‘huber’, ‘quantile’}, optional (default=’ls’)

Loss function to be optimized. ‘ls’ refers to least squares regression. ‘lad’ (least absolute deviation) is a highly robust loss function solely based on order information of the input variables. ‘huber’ is a combination of the two. ‘quantile’ allows quantile regression (use alpha to specify the quantile).

learning_rate

:float, optional (default=0.1)

Learning rate shrinks the contribution of each tree by learning_rate. There is a trade-off between learning_rate and n_estimators.

n_estimators

:int (default=100)

The number of boosting stages to perform. Gradient boosting is fairly robust to over-fitting so a large number usually results in better performance.

max_depth

:integer, optional (default=3)

Maximum depth of the individual regression estimators. The maximum depth limits the number of nodes in the tree. Tune this parameter for best performance; the best value depends on the interaction of the input variables. Ignored if max_leaf_nodes is not None.

min_samples_split

:integer, optional (default=2)

The minimum number of samples required to split an internal node.

min_samples_leaf

:integer, optional (default=1)

The minimum number of samples required to be at a leaf node.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node.

subsample

:float, optional (default=1.0)

The fraction of samples to be used for fitting the individual base learners. If smaller than 1.0 this results in Stochastic Gradient Boosting. subsample interacts with the parameter n_estimators. Choosing subsample < 1.0 leads to a reduction of variance and an increase in bias.

max_features

:int, float, string or None, optional (default=None)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.

If float, then max_features is a percentage and int(max_features * n_features) features are considered at each split.

If “auto”, then max_features=n_features.

If “sqrt”, then max_features=sqrt(n_features).

If “log2”, then max_features=log2(n_features).

If None, then max_features=n_features.

Choosing max_features < n_features leads to a reduction of variance and an increase in bias.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if this requires effectively inspecting more than max_features features.

max_leaf_nodes

:int or None, optional (default=None)

Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes.

alpha

:float (default=0.9)

The alpha-quantile of the huber loss function and the quantile loss function. Only if loss='huber' or loss='quantile'.

init

:BaseEstimator, None, optional (default=None)

An estimator object that is used to compute the initial predictions. init has to provide fit and predict. If None it uses loss.init_estimator.

verbose

:int, default: 0

Enable verbose output. If 1 then it prints progress and performance once in a while (the more trees the lower the frequency). If greater than 1 then it prints progress and performance for every tree.

warm_start

:bool, default: False

When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just erase the previous solution.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

presort

:bool or ‘auto’, optional (default=’auto’)

Whether to presort the data to speed up the finding of best splits in fitting. Auto mode by default will use presorting on dense data and default to normal sorting on sparse data. Setting presort to true on sparse data will raise an error.

New in version 0.17: optional parameter presort.

Attributes

feature_importances_: The feature importances (the higher, the more important the feature).
oob_improvement_: The improvement in loss (= deviance) on the out-of-bag samples relative to the previous iteration. oob_improvement_[0] is the improvement in loss of the first stage over the init estimator.
train_score_: The i-th score train_score_[i] is the deviance (= loss) of the model at iteration i on the in-bag sample. If subsample == 1 this is the deviance on the training data.
loss_: The concrete LossFunction object.
init: The estimator that provides the initial predictions. Set via the init argument or loss.init_estimator.
estimators_: The collection of fitted sub-estimators.

See also

DecisionTreeRegressor, RandomForestRegressor

References

J. Friedman, Greedy Function Approximation: A Gradient Boosting Machine, The Annals of Statistics, Vol. 29, No. 5, 2001.

J. Friedman, Stochastic Gradient Boosting, 1999.

T. Hastie, R. Tibshirani and J. Friedman. Elements of Statistical Learning Ed. 2, Springer, 2009.

POSSIBLE NODE NAMES:
	GradientBoostingRegressorSklearn GradientBoostingRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.GridSearchCVTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.GridSearchCVTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Exhaustive search over specified parameter values for an estimator.

This node has been automatically generated by wrapping the sklearn.grid_search.GridSearchCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Important members are fit, predict.

GridSearchCV implements a “fit” and a “score” method. It also implements “predict”, “predict_proba”, “decision_function”, “transform” and “inverse_transform” if they are implemented in the estimator used.

The parameters of the estimator used to apply these methods are optimized by cross-validated grid-search over a parameter grid.
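What "a parameter grid" means in practice can be sketched with the standard library: every dictionary is expanded into the Cartesian product of its value lists, and a list of dictionaries simply concatenates several such grids. The function name `expand_grid` is hypothetical, and sorting the parameter names is a choice made here for reproducible output.

```python
from itertools import product


def expand_grid(param_grid):
    """Yield one dict per parameter combination."""
    grids = [param_grid] if isinstance(param_grid, dict) else param_grid
    for grid in grids:
        names = sorted(grid)
        for values in product(*(grid[name] for name in names)):
            yield dict(zip(names, values))


combinations = list(expand_grid({'kernel': ('linear', 'rbf'), 'C': [1, 10]}))
for params in combinations:
    print(params)
```

Each of the resulting settings is evaluated by cross-validation, so the total cost grows with the product of the list lengths times the number of folds.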

Read more in the User Guide.

Parameters

estimator

:estimator object.

An object of that type is instantiated for each grid point. This is assumed to implement the scikit-learn estimator interface. Either estimator needs to provide a score function, or scoring must be passed.

param_grid

:dict or list of dictionaries

Dictionary with parameter names (string) as keys and lists of parameter settings to try as values, or a list of such dictionaries, in which case the grids spanned by each dictionary in the list are explored. This enables searching over any sequence of parameter settings.

scoring

:string, callable or None, default=None

A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y). If None, the score method of the estimator is used.

fit_params

:dict, optional

Parameters to pass to the fit method.

n_jobs

:int, default=1

Number of jobs to run in parallel.

Changed in version 0.17: Upgraded to joblib 0.9.3.

pre_dispatch

:int, or string, optional

Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:

None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs

An int, giving the exact number of total jobs that are spawned

A string, giving an expression as a function of n_jobs, as in ‘2*n_jobs’

iid

:boolean, default=True

If True, the data is assumed to be identically distributed across the folds, and the loss minimized is the total loss per sample, and not the mean loss across the folds.

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

refit

:boolean, default=True

Refit the best estimator with the entire dataset. If “False”, it is impossible to make predictions using this GridSearchCV instance after fitting.

verbose

:integer

Controls the verbosity: the higher, the more messages.

error_score

:‘raise’ (default) or numeric

Value to assign to the score if an error occurs in estimator fitting. If set to ‘raise’, the error is raised. If a numeric value is given, FitFailedWarning is raised. This parameter does not affect the refit step, which will always raise the error.

Examples

>>> from sklearn import svm, grid_search, datasets
>>> iris = datasets.load_iris()
>>> parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
>>> svr = svm.SVC()
>>> clf = grid_search.GridSearchCV(svr, parameters)
>>> clf.fit(iris.data, iris.target)
...                             
GridSearchCV(cv=None, error_score=...,
       estimator=SVC(C=1.0, cache_size=..., class_weight=..., coef0=...,
                     decision_function_shape=None, degree=..., gamma=...,
                     kernel='rbf', max_iter=-1, probability=False,
                     random_state=None, shrinking=True, tol=...,
                     verbose=False),
       fit_params={}, iid=..., n_jobs=1,
       param_grid=..., pre_dispatch=..., refit=...,
       scoring=..., verbose=...)

Attributes

grid_scores_

:list of named tuples

Contains scores for all parameter combinations in param_grid. Each entry corresponds to one parameter setting. Each named tuple has the attributes:

parameters, a dict of parameter settings

mean_validation_score, the mean score over the cross-validation folds

cv_validation_scores, the list of scores for each fold

best_estimator_

:estimator

Estimator that was chosen by the search, i.e. estimator which gave highest score (or smallest loss if specified) on the left out data. Not available if refit=False.

best_score_

:float

Score of best_estimator on the left out data.

best_params_

:dict

Parameter setting that gave the best results on the hold out data.

scorer_

:function

Scorer function used on the held out data to choose the best parameters for the model.
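The relationship between grid_scores_, best_params_ and best_score_ can be pictured with a stand-in data structure. The named tuple below mirrors the three attributes listed above; the type name `GridScore` is hypothetical and all scores are invented placeholders, not measured results.

```python
from collections import namedtuple

# Stand-in for the entries of grid_scores_ (the type name is hypothetical).
GridScore = namedtuple(
    "GridScore", ["parameters", "mean_validation_score", "cv_validation_scores"])

grid_scores = [
    GridScore({"C": 1, "kernel": "linear"}, 0.90, [0.88, 0.90, 0.92]),
    GridScore({"C": 10, "kernel": "linear"}, 0.93, [0.91, 0.93, 0.95]),
    GridScore({"C": 1, "kernel": "rbf"}, 0.91, [0.90, 0.91, 0.92]),
]

# The best setting is the one with the highest mean validation score.
best = max(grid_scores, key=lambda entry: entry.mean_validation_score)
best_params, best_score = best.parameters, best.mean_validation_score
print(best_params, best_score)
```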

Notes

The parameters selected are those that maximize the score of the left out data, unless an explicit score is passed in which case it is used instead.

If n_jobs was set to a value higher than one, the data is copied for each point in the grid (and not n_jobs times). This is done for efficiency reasons if individual jobs take very little time, but may raise errors if the dataset is large and not enough memory is available. A workaround in this case is to set pre_dispatch. Then, the memory is copied only pre_dispatch many times. A reasonable value for pre_dispatch is 2 * n_jobs.
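A sketch of how the three accepted forms of pre_dispatch (None, an integer, or an expression in n_jobs) could be turned into a number of up-front tasks is given below. The function name `resolve_pre_dispatch` and the `n_tasks` argument are hypothetical, and the restricted eval is only one possible way to evaluate such an expression.

```python
def resolve_pre_dispatch(pre_dispatch, n_jobs, n_tasks):
    """Number of tasks handed to the workers up front."""
    if pre_dispatch is None:
        return n_tasks                      # everything is dispatched at once
    if isinstance(pre_dispatch, str):
        # evaluate expressions such as '2*n_jobs' with only n_jobs in scope
        pre_dispatch = eval(pre_dispatch, {"__builtins__": {}}, {"n_jobs": n_jobs})
    return min(int(pre_dispatch), n_tasks)


print(resolve_pre_dispatch('2*n_jobs', n_jobs=4, n_tasks=100))
print(resolve_pre_dispatch(None, n_jobs=4, n_tasks=100))
```

With '2*n_jobs' and four workers, only eight copies of the data would be in flight at a time instead of one per grid point.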

See Also

ParameterGrid:

generates all the combinations of a hyperparameter grid.

sklearn.cross_validation.train_test_split():

utility function to split the data into a development set usable for fitting a GridSearchCV instance and an evaluation set for its final evaluation.

sklearn.metrics.make_scorer():

Make a scorer from a performance metric or loss function.

POSSIBLE NODE NAMES:
	GridSearchCVTransformerSklearn GridSearchCVTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.HashingVectorizerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.HashingVectorizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Convert a collection of text documents to a matrix of token occurrences

This node has been automatically generated by wrapping the sklearn.feature_extraction.text.HashingVectorizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

It turns a collection of text documents into a scipy.sparse matrix holding token occurrence counts (or binary occurrence information), possibly normalized as token frequencies if norm=’l1’ or projected on the euclidean unit sphere if norm=’l2’.

This text vectorizer implementation uses the hashing trick to find the token string name to feature integer index mapping.

This strategy has several advantages:

it is very low memory, scalable to large datasets as there is no need to store a vocabulary dictionary in memory
it is fast to pickle and un-pickle as it holds no state besides the constructor parameters
it can be used in a streaming (partial fit) or parallel pipeline as there is no state computed during fit.

There are also a couple of cons (vs using a CountVectorizer with an in-memory vocabulary):

there is no way to compute the inverse transform (from feature indices to string feature names) which can be a problem when trying to introspect which features are most important to a model.
there can be collisions: distinct tokens can be mapped to the same feature index. However in practice this is rarely an issue if n_features is large enough (e.g. 2 ** 18 for text classification problems).
no IDF weighting as this would render the transformer stateful.

The hash function employed is the signed 32-bit version of Murmurhash3.
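The hashing trick itself fits in a few lines. In the sketch below, CRC32 from the standard library stands in for MurmurHash3 (a deliberate substitution, so the indices differ from those of the wrapped class), and the function names are hypothetical. The column index is derived from the absolute hash value and the sign of the hash decides whether the count is added or subtracted.

```python
import zlib


def signed_hash32(token):
    """Signed 32-bit hash; CRC32 stands in for MurmurHash3 in this sketch."""
    value = zlib.crc32(token.encode("utf-8")) & 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def hash_vectorize(tokens, n_features):
    """Accumulate signed token counts into a fixed-length vector."""
    vector = [0] * n_features
    for token in tokens:
        h = signed_hash32(token)
        index = abs(h) % n_features          # no vocabulary lookup needed
        vector[index] += 1 if h >= 0 else -1
    return vector


vector = hash_vectorize("the cat sat on the mat".split(), n_features=16)
print(vector)
```

Because the mapping depends only on the token and n_features, no state has to be learned or stored, which is what makes the transformer usable in streaming settings; the price is that distinct tokens may share a column.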

Read more in the User Guide.

Parameters

input

:string {‘filename’, ‘file’, ‘content’}

If ‘filename’, the sequence passed as an argument to fit is expected to be a list of filenames that need reading to fetch the raw content to analyze.

If ‘file’, the sequence items must have a ‘read’ method (file-like object) that is called to fetch the bytes in memory.

Otherwise the input is expected to be a sequence of string or bytes items that are analyzed directly.

encoding

:string, default=’utf-8’

If bytes or files are given to analyze, this encoding is used to decode.

decode_error

:{‘strict’, ‘ignore’, ‘replace’}

Instruction on what to do if a byte sequence is given to analyze that contains characters not of the given encoding. By default, it is ‘strict’, meaning that a UnicodeDecodeError will be raised. Other values are ‘ignore’ and ‘replace’.

strip_accents

:{‘ascii’, ‘unicode’, None}

Remove accents during the preprocessing step. ‘ascii’ is a fast method that only works on characters that have a direct ASCII mapping. ‘unicode’ is a slightly slower method that works on any characters. None (default) does nothing.

analyzer

:string, {‘word’, ‘char’, ‘char_wb’} or callable

Whether the feature should be made of word or character n-grams. Option ‘char_wb’ creates character n-grams only from text inside word boundaries.

If a callable is passed it is used to extract the sequence of features out of the raw, unprocessed input.

preprocessor

:callable or None (default)

Override the preprocessing (string transformation) stage while preserving the tokenizing and n-grams generation steps.

tokenizer

:callable or None (default)

Override the string tokenization step while preserving the preprocessing and n-grams generation steps. Only applies if analyzer == 'word'.

ngram_range

:tuple (min_n, max_n), default=(1, 1)

The lower and upper boundary of the range of n-values for different n-grams to be extracted. All values of n such that min_n <= n <= max_n will be used.
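As a rough illustration of how ngram_range selects n-grams, the following sketch builds word n-grams for every n between min_n and max_n. The helper `word_ngrams` is hypothetical and the output ordering is an assumption; it is not the wrapped class's own code.

```python
def word_ngrams(tokens, ngram_range=(1, 1)):
    """Return all word n-grams with min_n <= n <= max_n, joined by spaces."""
    min_n, max_n = ngram_range
    ngrams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the token list.
        for start in range(len(tokens) - n + 1):
            ngrams.append(" ".join(tokens[start:start + n]))
    return ngrams


print(word_ngrams(["a", "b", "c"], ngram_range=(1, 2)))
# ['a', 'b', 'c', 'a b', 'b c']
```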

stop_words

:string {‘english’}, list, or None (default)

If ‘english’, a built-in stop word list for English is used.

If a list, that list is assumed to contain stop words, all of which will be removed from the resulting tokens. Only applies if analyzer == 'word'.

lowercase

:boolean, default=True

Convert all characters to lowercase before tokenizing.

token_pattern

:string

Regular expression denoting what constitutes a “token”, only used if analyzer == 'word'. The default regexp selects tokens of 2 or more alphanumeric characters (punctuation is completely ignored and always treated as a token separator).

n_features

:integer, default=(2 ** 20)

The number of features (columns) in the output matrices. Small numbers of features are likely to cause hash collisions, but large numbers will cause larger coefficient dimensions in linear learners.

norm

:‘l1’, ‘l2’ or None, optional

Norm used to normalize term vectors. None for no normalization.

binary: boolean, default=False.

If True, all non zero counts are set to 1. This is useful for discrete probabilistic models that model binary events rather than integer counts.

dtype: type, optional

Type of the matrix returned by fit_transform() or transform().

non_negative

:boolean, default=False

Whether output matrices should contain non-negative values only; effectively calls abs on the matrix prior to returning it. When True, output values can be interpreted as frequencies. When False, output values will have expected value zero.

See also

CountVectorizer, TfidfVectorizer

POSSIBLE NODE NAMES:
	HashingVectorizerTransformerSklearn HashingVectorizerTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ImputerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ImputerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Imputation transformer for completing missing values.

This node has been automatically generated by wrapping the sklearn.preprocessing.imputation.Imputer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

missing_values

:integer or “NaN”, optional (default=”NaN”)

The placeholder for the missing values. All occurrences of missing_values will be imputed. For missing values encoded as np.nan, use the string value “NaN”.

strategy

:string, optional (default=”mean”)

The imputation strategy.

If “mean”, then replace missing values using the mean along the axis.
If “median”, then replace missing values using the median along the axis.
If “most_frequent”, then replace missing values using the most frequent value along the axis.

axis

:integer, optional (default=0)

The axis along which to impute.

If axis=0, then impute along columns.
If axis=1, then impute along rows.

verbose

:integer, optional (default=0)

Controls the verbosity of the imputer.

copy

:boolean, optional (default=True)

If True, a copy of X will be created. If False, imputation will be done in-place whenever possible. Note that, in the following cases, a new copy will always be made, even if copy=False:

If X is not an array of floating values;
If X is sparse and missing_values=0;
If axis=0 and X is encoded as a CSR matrix;
If axis=1 and X is encoded as a CSC matrix.

Attributes

statistics_: The imputation fill value for each feature if axis == 0.

Notes

When axis=0, columns which only contained missing values at fit are discarded upon transform.
When axis=1, an exception is raised if there are rows for which it is not possible to fill in the missing values (e.g., because they only contain missing values).
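The “mean” strategy with axis=0, including the note about all-missing columns, can be sketched in plain Python. This is a simplified illustration under the assumption that missing values are encoded as NaN; the helper `impute_mean_columns` is hypothetical and is not the wrapped class.

```python
import math


def impute_mean_columns(rows):
    """Replace NaN entries by their column mean; drop all-NaN columns."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # A column without any observed value has no mean and is discarded.
        means.append(sum(observed) / len(observed) if observed else None)
    kept = [j for j in range(n_cols) if means[j] is not None]
    return [[means[j] if math.isnan(row[j]) else row[j] for j in kept]
            for row in rows]


nan = float("nan")
data = [[1.0, nan, 2.0], [nan, nan, 4.0], [3.0, nan, 6.0]]
print(impute_mean_columns(data))
# [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
```

The second column contains only missing values, so it disappears from the output, which mirrors the behaviour described for axis=0.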

POSSIBLE NODE NAMES:
	ImputerTransformerSklearnNode ImputerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.IncrementalPCATransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.IncrementalPCATransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Incremental principal components analysis (IPCA).

This node has been automatically generated by wrapping the sklearn.decomposition.incremental_pca.IncrementalPCA class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Linear dimensionality reduction using Singular Value Decomposition of centered data, keeping only the most significant singular vectors to project the data to a lower dimensional space.

Depending on the size of the input data, this algorithm can be much more memory efficient than a PCA.

This algorithm has constant memory complexity, on the order of batch_size, enabling use of np.memmap files without loading the entire file into memory.

The computational overhead of each SVD is O(batch_size * n_features ** 2), but only 2 * batch_size samples remain in memory at a time. There will be n_samples / batch_size SVD computations to get the principal components, versus 1 large SVD of complexity O(n_samples * n_features ** 2) for PCA.
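The figures in the paragraph above can be turned into a small back-of-the-envelope calculation. The helper `ipca_batches` is hypothetical; rounding the batch count up for a final partial batch is an assumption, and the default batch size of 5 * n_features is the one stated for the batch_size parameter.

```python
import math


def ipca_batches(n_samples, n_features, batch_size=None):
    """Return (batch_size, number of SVDs, samples held in memory)."""
    if batch_size is None:
        # Documented default when batch_size is None.
        batch_size = 5 * n_features
    # Assumption: a final partial batch still costs one SVD.
    n_svds = math.ceil(n_samples / batch_size)
    return batch_size, n_svds, 2 * batch_size


print(ipca_batches(n_samples=10000, n_features=50))
# (250, 40, 500)
```

For 10000 samples with 50 features this gives 40 small SVDs with at most 500 samples in memory, instead of one SVD over all 10000 samples.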

Read more in the User Guide.

Parameters

n_components

:int or None, (default=None)

Number of components to keep. If n_components is None, then n_components is set to min(n_samples, n_features).

batch_size

:int or None, (default=None)

The number of samples to use for each batch. Only used when calling fit. If batch_size is None, then batch_size is inferred from the data and set to 5 * n_features, to provide a balance between approximation accuracy and memory consumption.

copy

:bool, (default=True)

If False, X will be overwritten. copy=False can be used to save memory but is unsafe for general use.

whiten

:bool, optional

When True (False by default) the components_ vectors are divided by n_samples times components_ to ensure uncorrelated outputs with unit component-wise variances.

Whitening will remove some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of the downstream estimators by making data respect some hard-wired assumptions.

Attributes

components_: Components with maximum variance.
explained_variance_: Variance explained by each of the selected components.
explained_variance_ratio_: Percentage of variance explained by each of the selected components. If all components are stored, the sum of explained variances is equal to 1.0
mean_: Per-feature empirical mean, aggregate over calls to partial_fit.
var_: Per-feature empirical variance, aggregate over calls to partial_fit.
noise_variance_: The estimated noise covariance following the Probabilistic PCA model from Tipping and Bishop 1999. See “Pattern Recognition and Machine Learning” by C. Bishop, 12.2.1 p. 574 or http://www.miketipping.com/papers/met-mppca.pdf.
n_components_: The estimated number of components. Relevant when n_components=None.
n_samples_seen_: The number of samples processed by the estimator. Will be reset on new calls to fit, but increments across partial_fit calls.

Notes

Implements the incremental PCA model from:

D. Ross, J. Lim, R. Lin, M. Yang, Incremental Learning for Robust Visual Tracking, International Journal of Computer Vision, Volume 77, Issue 1-3, pp. 125-141, May 2008. See http://www.cs.toronto.edu/~dross/ivt/RossLimLinYang_ijcv.pdf

This model is an extension of the Sequential Karhunen-Loeve Transform from:

A. Levy and M. Lindenbaum, Sequential Karhunen-Loeve Basis Extraction and its Application to Images, IEEE Transactions on Image Processing, Volume 9, Number 8, pp. 1371-1374, August 2000. See http://www.cs.technion.ac.il/~mic/doc/skl-ip.pdf

We have specifically abstained from an optimization used by authors of both papers, a QR decomposition used in specific situations to reduce the algorithmic complexity of the SVD. The source for this technique is Matrix Computations, Third Edition, G. Golub and C. Van Loan, Chapter 5, section 5.4.4, pp. 252-253. This technique has been omitted because it is advantageous only when decomposing a matrix with n_samples (rows) >= 5/3 * n_features (columns), and hurts the readability of the implemented algorithm. This would be a good opportunity for future optimization, if it is deemed necessary.

References

D. Ross, J. Lim, R. Lin, M. Yang. Incremental Learning for Robust Visual Tracking, International Journal of Computer Vision, Volume 77, Issue 1-3, pp. 125-141, May 2008.

G. Golub and C. Van Loan. Matrix Computations, Third Edition, Chapter 5, Section 5.4.4, pp. 252-253.

See also

PCA RandomizedPCA KernelPCA SparsePCA TruncatedSVD

POSSIBLE NODE NAMES:
	IncrementalPCATransformerSklearnNode IncrementalPCATransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.IsomapTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.IsomapTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Isomap Embedding

This node has been automatically generated by wrapping the sklearn.manifold.isomap.Isomap class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Non-linear dimensionality reduction through Isometric Mapping

Read more in the User Guide.

Parameters

n_neighbors

:integer

Number of neighbors to consider for each point.

n_components

:integer

Number of coordinates for the manifold.

eigen_solver

:[‘auto’|’arpack’|’dense’]

‘auto’ : Attempt to choose the most efficient solver for the given problem.

‘arpack’ : Use Arnoldi decomposition to find the eigenvalues and eigenvectors.

‘dense’ : Use a direct solver (i.e. LAPACK) for the eigenvalue decomposition.

tol

:float

Convergence tolerance passed to arpack or lobpcg. Not used if eigen_solver == ‘dense’.

max_iter

:integer

Maximum number of iterations for the arpack solver. Not used if eigen_solver == ‘dense’.

path_method

:string [‘auto’|’FW’|’D’]

Method to use in finding shortest path.

‘auto’ : attempt to choose the best algorithm automatically.

‘FW’ : Floyd-Warshall algorithm.

‘D’ : Dijkstra’s algorithm.
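To make the role of path_method concrete, here is a minimal Floyd-Warshall sketch that turns neighbor-graph edge lengths into all-pairs shortest-path (geodesic) distances. It is a generic textbook version, not the routine used by the wrapped class, and the toy graph is made up for illustration.

```python
import math


def floyd_warshall(dist):
    """All-pairs shortest paths; dist[i][j] is math.inf where no edge exists."""
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Allow paths that pass through intermediate node k.
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


inf = math.inf
# Chain 0 - 1 - 2: nodes 0 and 2 are not direct neighbors.
graph = [[0.0, 1.0, inf],
         [1.0, 0.0, 1.0],
         [inf, 1.0, 0.0]]
print(floyd_warshall(graph)[0][2])  # 2.0
```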

neighbors_algorithm

:string [‘auto’|’brute’|’kd_tree’|’ball_tree’]

Algorithm to use for nearest neighbors search, passed to neighbors.NearestNeighbors instance.

Attributes

embedding_: Stores the embedding vectors.
kernel_pca_: KernelPCA object used to implement the embedding.
training_data_: Stores the training data.
nbrs_: Stores nearest neighbors instance, including BallTree or KDTree if applicable.
dist_matrix_: Stores the geodesic distance matrix of training data.

References

[1]	Tenenbaum, J.B.; De Silva, V.; & Langford, J.C. A global geometric framework for nonlinear dimensionality reduction. Science 290 (5500)

POSSIBLE NODE NAMES:
	IsomapTransformerSklearn IsomapTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.IsotonicRegressionSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.IsotonicRegressionSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Isotonic regression model.

This node has been automatically generated by wrapping the sklearn.isotonic.IsotonicRegression class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The isotonic regression optimization problem is defined by:

min sum w[i] (y[i] - y_[i]) ** 2

subject to y_[i] <= y_[j] whenever X[i] <= X[j]
and min(y_) = y_min, max(y_) = y_max

where:

y[i] are inputs (real numbers)

y_[i] are fitted values

X specifies the order.

If X is non-decreasing then y_ is non-decreasing.

w[i] are optional strictly positive weights (default to 1.0)
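One common way to solve this problem is the pool-adjacent-violators algorithm (PAVA) named in the references. The sketch below is a minimal version under simplifying assumptions: y is already sorted by X, the fit is non-decreasing, and the y_min / y_max bounds are ignored. The function name `pava` is hypothetical and this is not the wrapped implementation.

```python
def pava(y, w=None):
    """Weighted least-squares non-decreasing fit of y (assumed sorted by X)."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


print(pava([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The violating pair (3, 2) is replaced by its weighted mean 2.5, which is the smallest change that restores the ordering constraint.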

Read more in the User Guide.

Parameters

y_min

:optional, default: None

If not None, set the lowest value of the fit to y_min.

y_max

:optional, default: None

If not None, set the highest value of the fit to y_max.

increasing

:boolean or string, optional, default: True

If boolean, whether or not to fit the isotonic regression with y increasing or decreasing.

The string value “auto” determines whether y should increase or decrease based on the Spearman correlation estimate’s sign.

out_of_bounds

:string, optional, default: “nan”

Determines how x-values outside of the training domain are handled. When set to “nan”, predicted y-values will be NaN. When set to “clip”, predicted y-values will be set to the value corresponding to the nearest train interval endpoint. When set to “raise”, allow interp1d to throw ValueError.

Attributes

X_: A copy of the input X.
y_: Isotonic fit of y.
X_min_: Minimum value of input array X_ for left bound.
X_max_: Maximum value of input array X_ for right bound.
f_: The stepwise interpolating function that covers the domain X_.

Notes

Ties are broken using the secondary method from Leeuw, 1977.

References

Isotonic Median Regression: A Linear Programming Approach Nilotpal Chakravarti Mathematics of Operations Research Vol. 14, No. 2 (May, 1989), pp. 303-308

Isotone Optimization in R : Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods Leeuw, Hornik, Mair Journal of Statistical Software 2009

Correctness of Kruskal’s algorithms for monotone regression with ties Leeuw, Psychometrika, 1977

POSSIBLE NODE NAMES:
	IsotonicRegressionSklearn IsotonicRegressionSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.KNeighborsClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.KNeighborsClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Classifier implementing the k-nearest neighbors vote.

This node has been automatically generated by wrapping the sklearn.neighbors.classification.KNeighborsClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

n_neighbors

:int, optional (default = 5)

Number of neighbors to use by default for k_neighbors() queries.

weights

:str or callable

weight function used in prediction. Possible values:

‘uniform’ : uniform weights. All points in each neighborhood are weighted equally.
‘distance’ : weight points by the inverse of their distance. In this case, closer neighbors of a query point will have a greater influence than neighbors which are further away.
[callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.

Uniform weights are used by default.
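The difference between ‘uniform’ and ‘distance’ weighting can be seen in a small vote sketch. It assumes one-dimensional features and a hypothetical helper `knn_vote`; it only illustrates the weighting rule and is not the wrapped classifier.

```python
from collections import defaultdict


def knn_vote(X, y, query, k=3, weights="uniform"):
    """Predict a label for a 1-D query from its k nearest training points."""
    neighbors = sorted((abs(x - query), label) for x, label in zip(X, y))[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        if weights == "uniform":
            votes[label] += 1.0
        elif distance == 0.0:
            # An exact match would get infinite weight; return it directly.
            return label
        else:
            votes[label] += 1.0 / distance
    return max(votes, key=votes.get)


X = [0.0, 0.2, 1.0]
y = [0, 0, 1]
print(knn_vote(X, y, 0.9, weights="uniform"))   # 0 (two votes against one)
print(knn_vote(X, y, 0.9, weights="distance"))  # 1 (the close neighbor dominates)
```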

algorithm

:{‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’}, optional

Algorithm used to compute the nearest neighbors:

‘ball_tree’ will use BallTree
‘kd_tree’ will use KDTree
‘brute’ will use a brute-force search.
‘auto’ will attempt to decide the most appropriate algorithm based on the values passed to fit() method.

Note: fitting on sparse input will override the setting of this parameter, using brute force.

leaf_size

:int, optional (default = 30)

Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem.

metric

:string or DistanceMetric object (default = ‘minkowski’)

The distance metric to use for the tree. The default metric is minkowski, and with p=2 is equivalent to the standard Euclidean metric. See the documentation of the DistanceMetric class for a list of available metrics.

p

:integer, optional (default = 2)

Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.
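The relationship between p and the named distances can be checked with a short sketch. The function below is a plain-Python illustration with a hypothetical name, not the metric object used by the wrapped class.

```python
def minkowski_distance(a, b, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


a, b = [0.0, 0.0], [3.0, 4.0]
print(minkowski_distance(a, b, p=1))  # Manhattan distance: 7.0
print(minkowski_distance(a, b, p=2))  # Euclidean distance: 5.0
```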

metric_params

:dict, optional (default = None)

Additional keyword arguments for the metric function.

n_jobs

:int, optional (default = 1)

The number of parallel jobs to run for neighbors search. If -1, then the number of jobs is set to the number of CPU cores. Doesn’t affect fit() method.

Examples

>>> X = [[0], [1], [2], [3]]
>>> y = [0, 0, 1, 1]
>>> from sklearn.neighbors import KNeighborsClassifier
>>> neigh = KNeighborsClassifier(n_neighbors=3)
>>> neigh.fit(X, y) 
KNeighborsClassifier(...)
>>> print(neigh.predict([[1.1]]))
[0]
>>> print(neigh.predict_proba([[0.9]]))
[[ 0.66666667  0.33333333]]

See also

RadiusNeighborsClassifier KNeighborsRegressor RadiusNeighborsRegressor NearestNeighbors

Notes

See Nearest Neighbors in the online documentation for a discussion of the choice of algorithm and leaf_size.

Warning

Regarding the Nearest Neighbors algorithms, if it is found that two neighbors, neighbor k+1 and k, have identical distances but different labels, the results will depend on the ordering of the training data.

http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm

POSSIBLE NODE NAMES:
	KNeighborsClassifierSklearnNode KNeighborsClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.KNeighborsRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.KNeighborsRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Regression based on k-nearest neighbors.

This node has been automatically generated by wrapping the sklearn.neighbors.regression.KNeighborsRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The target is predicted by local interpolation of the targets associated with the nearest neighbors in the training set.
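With uniform weights, this local interpolation reduces to averaging the targets of the k nearest training points. A minimal sketch, assuming one-dimensional features and a hypothetical helper `knn_regress`:

```python
def knn_regress(X, y, query, k=2):
    """Average the targets of the k training points closest to a 1-D query."""
    nearest = sorted(zip(X, y), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(target for _, target in nearest) / k


X = [0, 1, 2, 3]
y = [0, 0, 1, 1]
print(knn_regress(X, y, 1.5, k=2))  # 0.5
```

The two nearest points to 1.5 are 1 and 2 with targets 0 and 1, so the prediction is their mean.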

Read more in the User Guide.

Parameters

n_neighbors

:int, optional (default = 5)

Number of neighbors to use by default for k_neighbors() queries.

weights

:str or callable

weight function used in prediction. Possible values:

‘uniform’ : uniform weights. All points in each neighborhood are weighted equally.
‘distance’ : weight points by the inverse of their distance. In this case, closer neighbors of a query point will have a greater influence than neighbors which are further away.
[callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.

Uniform weights are used by default.

algorithm

:{‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’}, optional

Algorithm used to compute the nearest neighbors:

‘ball_tree’ will use BallTree
‘kd_tree’ will use KDTree
‘brute’ will use a brute-force search.
‘auto’ will attempt to decide the most appropriate algorithm based on the values passed to fit() method.

Note: fitting on sparse input will override the setting of this parameter, using brute force.

leaf_size

:int, optional (default = 30)

Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem.

metric

:string or DistanceMetric object (default=’minkowski’)

The distance metric to use for the tree. The default metric is minkowski, and with p=2 is equivalent to the standard Euclidean metric. See the documentation of the DistanceMetric class for a list of available metrics.

p

:integer, optional (default = 2)

Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.

metric_params

:dict, optional (default = None)

Additional keyword arguments for the metric function.

n_jobs

:int, optional (default = 1)

The number of parallel jobs to run for neighbors search. If -1, then the number of jobs is set to the number of CPU cores. Doesn’t affect fit() method.

Examples

>>> X = [[0], [1], [2], [3]]
>>> y = [0, 0, 1, 1]
>>> from sklearn.neighbors import KNeighborsRegressor
>>> neigh = KNeighborsRegressor(n_neighbors=2)
>>> neigh.fit(X, y) 
KNeighborsRegressor(...)
>>> print(neigh.predict([[1.5]]))
[ 0.5]

See also

NearestNeighbors RadiusNeighborsRegressor KNeighborsClassifier RadiusNeighborsClassifier

Notes

See Nearest Neighbors in the online documentation for a discussion of the choice of algorithm and leaf_size.

Warning

Regarding the Nearest Neighbors algorithms, if it is found that two neighbors, neighbor k+1 and k, have identical distances but different labels, the results will depend on the ordering of the training data.

http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm

POSSIBLE NODE NAMES:
	KNeighborsRegressorSklearnNode KNeighborsRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.KernelCentererTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.KernelCentererTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Center a kernel matrix

This node has been automatically generated by wrapping the sklearn.preprocessing.data.KernelCenterer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Let K(x, z) be a kernel defined by phi(x)^T phi(z), where phi is a function mapping x to a Hilbert space. KernelCenterer centers (i.e., normalizes to have zero mean) the data without explicitly computing phi(x). It is equivalent to centering phi(x) with sklearn.preprocessing.StandardScaler(with_std=False).
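For a training kernel matrix, centering in feature space amounts to subtracting row and column means and adding back the overall mean. The sketch below uses a hypothetical helper `center_kernel` on nested lists and checks the result against a linear kernel, where phi(x) = x can be centered explicitly; it is an illustration, not the wrapped implementation.

```python
def center_kernel(K):
    """Center a square kernel matrix: K - row means - column means + total mean."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def linear_gram(X):
    """Gram matrix of the linear kernel, K[i][j] = <X[i], X[j]>."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


X = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
X_centered = [[-2.0, 0.0], [0.0, -2.0], [2.0, 2.0]]  # X minus its column means

print(center_kernel(linear_gram(X)))
# [[4.0, 0.0, -4.0], [0.0, 4.0, -4.0], [-4.0, -4.0, 8.0]]
```

The centered kernel equals the Gram matrix of the explicitly centered data, without the centered features ever being formed.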

Read more in the User Guide.

POSSIBLE NODE NAMES:
	KernelCentererTransformerSklearnNode KernelCentererTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.KernelPCATransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.KernelPCATransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Kernel Principal component analysis (KPCA)

This node has been automatically generated by wrapping the sklearn.decomposition.kernel_pca.KernelPCA class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Non-linear dimensionality reduction through the use of kernels (see metrics).

Read more in the User Guide.

Parameters

n_components: int or None: Number of components. If None, all non-zero components are kept.
kernel: “linear” | “poly” | “rbf” | “sigmoid” | “cosine” | “precomputed”: Kernel. Default: “linear”
degree: Degree for poly kernels. Ignored by other kernels.
gamma: Kernel coefficient for rbf and poly kernels. Default: 1/n_features. Ignored by other kernels.
coef0: Independent term in poly and sigmoid kernels. Ignored by other kernels.
kernel_params: Parameters (keyword arguments) and values for kernel passed as callable object. Ignored by other kernels.
alpha: int: Hyperparameter of the ridge regression that learns the inverse transform (when fit_inverse_transform=True). Default: 1.0
fit_inverse_transform: bool: Learn the inverse transform for non-precomputed kernels. (i.e. learn to find the pre-image of a point) Default: False
eigen_solver: string [‘auto’|’dense’|’arpack’]: Select eigensolver to use. If n_components is much less than the number of training samples, arpack may be more efficient than the dense eigensolver.
tol: float: convergence tolerance for arpack. Default: 0 (optimal value will be chosen by arpack)
max_iter: maximum number of iterations for arpack. Default: None (optimal value will be chosen by arpack)
remove_zero_eig: If True, then all components with zero eigenvalues are removed, so that the number of components in the output may be < n_components (and sometimes even zero due to numerical instability). When n_components is None, this parameter is ignored and components with zero eigenvalues are removed regardless.

Attributes

lambdas_ :

Eigenvalues of the centered kernel matrix

alphas_ :

Eigenvectors of the centered kernel matrix

dual_coef_ :

Inverse transform matrix

X_transformed_fit_ :

Projection of the fitted data on the kernel principal components

References

Kernel PCA was introduced in:

Bernhard Schoelkopf, Alexander J. Smola, and Klaus-Robert Mueller. 1999. Kernel principal component analysis. In Advances in kernel methods, MIT Press, Cambridge, MA, USA 327-352.

POSSIBLE NODE NAMES:
	KernelPCATransformerSklearn KernelPCATransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.KernelRidgeRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.KernelRidgeRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Kernel ridge regression.

This node has been automatically generated by wrapping the sklearn.kernel_ridge.KernelRidge class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Kernel ridge regression (KRR) combines ridge regression (linear least squares with l2-norm regularization) with the kernel trick. It thus learns a linear function in the space induced by the respective kernel and the data. For non-linear kernels, this corresponds to a non-linear function in the original space.

The form of the model learned by KRR is identical to support vector regression (SVR). However, different loss functions are used: KRR uses squared error loss while support vector regression uses epsilon-insensitive loss, both combined with l2 regularization. In contrast to SVR, fitting a KRR model can be done in closed-form and is typically faster for medium-sized datasets. On the other hand, the learned model is non-sparse and thus slower than SVR, which learns a sparse model for epsilon > 0, at prediction-time.
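The closed-form fit mentioned above can be sketched as solving (K + alpha * I) c = y for the dual coefficients c and predicting with a kernel-weighted sum over the training points. The code below is a minimal illustration under stated assumptions: a linear kernel, a single target, and hypothetical helper names (`solve`, `krr_fit`, `krr_predict`). It is not the wrapped estimator.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def linear_kernel(a, b):
    return sum(x * y for x, y in zip(a, b))


def krr_fit(X, y, alpha=1.0):
    """Dual coefficients: solve (K + alpha * I) c = y."""
    n = len(X)
    K = [[linear_kernel(X[i], X[j]) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)


def krr_predict(X_fit, dual_coef, x):
    """Prediction is a kernel-weighted sum over all training points."""
    return sum(c * linear_kernel(xi, x) for c, xi in zip(dual_coef, X_fit))


X = [[1.0], [2.0]]
y = [1.0, 2.0]
dual_coef = krr_fit(X, y, alpha=1.0)
print(krr_predict(X, dual_coef, [3.0]))  # approximately 2.5
```

Every training point contributes to each prediction, which illustrates why the learned model is non-sparse and why the training data must be kept for prediction.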

This estimator has built-in support for multi-variate regression (i.e., when y is a 2d-array of shape [n_samples, n_targets]).

Read more in the User Guide.

Parameters

alpha: Small positive values of alpha improve the conditioning of the problem and reduce the variance of the estimates. Alpha corresponds to (2*C)^-1 in other linear models such as LogisticRegression or LinearSVC. If an array is passed, penalties are assumed to be specific to the targets. Hence they must correspond in number.
kernel: Kernel mapping used internally. A callable should accept two arguments and the keyword arguments passed to this object as kernel_params, and should return a floating point number.
gamma: Gamma parameter for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. Ignored by other kernels.
degree: Degree of the polynomial kernel. Ignored by other kernels.
coef0: Zero coefficient for polynomial and sigmoid kernels. Ignored by other kernels.
kernel_params: Additional parameters (keyword arguments) for kernel function passed as callable object.

Attributes

dual_coef_: Weight vector(s) in kernel space
X_fit_: Training data, which is also required for prediction

References

Kevin P. Murphy “Machine Learning: A Probabilistic Perspective”, The MIT Press chapter 14.4.3, pp. 492-493

See also

Ridge: Linear ridge regression.
SVR: Support Vector Regression implemented using libsvm.

Examples

>>> from sklearn.kernel_ridge import KernelRidge
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples)
>>> X = rng.randn(n_samples, n_features)
>>> clf = KernelRidge(alpha=1.0)
>>> clf.fit(X, y) 
KernelRidge(alpha=1.0, coef0=1, degree=3, gamma=None, kernel='linear',
            kernel_params=None)

POSSIBLE NODE NAMES:
	KernelRidgeRegressorSklearnNode KernelRidgeRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LabelBinarizerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LabelBinarizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Binarize labels in a one-vs-all fashion

This node has been automatically generated by wrapping the sklearn.preprocessing.label.LabelBinarizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Several regression and binary classification algorithms are available in the scikit. A simple way to extend these algorithms to the multi-class classification case is to use the so-called one-vs-all scheme.

At learning time, this simply consists in learning one regressor or binary classifier per class. In doing so, one needs to convert multi-class labels to binary labels (belong or does not belong to the class). LabelBinarizer makes this process easy with the transform method.

At prediction time, one assigns the class for which the corresponding model gave the greatest confidence. LabelBinarizer makes this easy with the inverse_transform method.
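The two directions of the one-vs-all scheme can be sketched without the library. The helpers `binarize` and `inverse_binarize` are hypothetical names, and picking the class with the largest score is the simplest reading of “greatest confidence”; this is not the wrapped class.

```python
def binarize(labels, classes, neg_label=0, pos_label=1):
    """One indicator column per class for each label."""
    return [[pos_label if label == c else neg_label for c in classes]
            for label in labels]


def inverse_binarize(scores, classes):
    """Assign each row the class whose model gave the highest score."""
    return [classes[max(range(len(classes)), key=row.__getitem__)]
            for row in scores]


classes = sorted(set([1, 2, 6, 4, 2]))
print(classes)                    # [1, 2, 4, 6]
print(binarize([1, 6], classes))  # [[1, 0, 0, 0], [0, 0, 0, 1]]

# Hypothetical per-class confidence scores from four binary models.
scores = [[0.1, 0.7, 0.1, 0.1], [0.2, 0.1, 0.3, 0.9]]
print(inverse_binarize(scores, classes))  # [2, 6]
```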

Read more in the User Guide.

Parameters

neg_label: Value with which negative labels must be encoded.
pos_label: Value with which positive labels must be encoded.
sparse_output: True if the returned array from transform is desired to be in sparse CSR format.

Attributes

classes_: Holds the label for each class.
y_type_: Represents the type of the target data as evaluated by utils.multiclass.type_of_target. Possible types are ‘continuous’, ‘continuous-multioutput’, ‘binary’, ‘multiclass’, ‘multiclass-multioutput’, ‘multilabel-indicator’, and ‘unknown’.
multilabel_: True if the transformer was fitted on a multilabel rather than a multiclass set of labels. The multilabel_ attribute is deprecated and will be removed in 0.18
sparse_input_: True if the input data to transform is given as a sparse matrix, False otherwise.
indicator_matrix_: ‘sparse’ when the input data to transform is a multilabel-indicator and is sparse, None otherwise. The indicator_matrix_ attribute is deprecated as of version 0.16 and will be removed in 0.18

Examples

>>> from sklearn import preprocessing
>>> lb = preprocessing.LabelBinarizer()
>>> lb.fit([1, 2, 6, 4, 2])
LabelBinarizer(neg_label=0, pos_label=1, sparse_output=False)
>>> lb.classes_
array([1, 2, 4, 6])
>>> lb.transform([1, 6])
array([[1, 0, 0, 0],
       [0, 0, 0, 1]])

Binary targets transform to a column vector

>>> lb = preprocessing.LabelBinarizer()
>>> lb.fit_transform(['yes', 'no', 'no', 'yes'])
array([[1],
       [0],
       [0],
       [1]])

Passing a 2D matrix for multilabel classification

>>> import numpy as np
>>> lb.fit(np.array([[0, 1, 1], [1, 0, 0]]))
LabelBinarizer(neg_label=0, pos_label=1, sparse_output=False)
>>> lb.classes_
array([0, 1, 2])
>>> lb.transform([0, 1, 2, 1])
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 1, 0]])

See also

label_binarize: LabelBinarizer with fixed classes.

POSSIBLE NODE NAMES:
	LabelBinarizerTransformerSklearnNode LabelBinarizerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LabelEncoderTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LabelEncoderTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Encode labels with value between 0 and n_classes-1.

This node has been automatically generated by wrapping the sklearn.preprocessing.label.LabelEncoder class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Attributes

classes_: Holds the label for each class.

Examples

LabelEncoder can be used to normalize labels.

>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6]) 
array([0, 0, 1, 2]...)
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])

It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels.

>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"]) 
array([2, 2, 1]...)
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']
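
Conceptually, the encoding amounts to sorting the distinct labels and replacing each label by its position in that sorted list. The following is a minimal standard-library sketch of that idea; the function names `fit_classes`, `encode` and `decode` are hypothetical and do not exist in sklearn or pySPACE.

```python
def fit_classes(labels):
    """Return the sorted distinct labels (labels must be hashable and comparable)."""
    return sorted(set(labels))


def encode(classes, labels):
    """Map each label to its index in the sorted class list."""
    index = {c: i for i, c in enumerate(classes)}
    return [index[y] for y in labels]


def decode(classes, codes):
    """Map integer codes back to the original labels."""
    return [classes[i] for i in codes]


classes = fit_classes(["paris", "paris", "tokyo", "amsterdam"])
codes = encode(classes, ["tokyo", "tokyo", "paris"])
```

In this sketch a label that was not seen during fitting raises a KeyError in `encode`.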

POSSIBLE NODE NAMES:
	LabelEncoderTransformerSklearnNode LabelEncoderTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LabelPropagationClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LabelPropagationClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Label Propagation classifier

This node has been automatically generated by wrapping the sklearn.semi_supervised.label_propagation.LabelPropagation class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

kernel: String identifier for kernel function to use. Only ‘rbf’ and ‘knn’ kernels are currently supported.
gamma: Parameter for rbf kernel
n_neighbors: Parameter for knn kernel
alpha: Clamping factor
max_iter: Maximum number of iterations allowed
tol: Convergence tolerance: threshold to consider the system at steady state

Attributes

X_: Input array.
classes_: The distinct labels used in classifying instances.
label_distributions_: Categorical distribution for each item.
transduction_: Label assigned to each item via the transduction.
n_iter_: Number of iterations run.

Examples

>>> import numpy as np
>>> from sklearn import datasets
>>> from sklearn.semi_supervised import LabelPropagation
>>> label_prop_model = LabelPropagation()
>>> iris = datasets.load_iris()
>>> random_unlabeled_points = np.where(np.random.randint(0, 2,
...    size=len(iris.target)))
>>> labels = np.copy(iris.target)
>>> labels[random_unlabeled_points] = -1
>>> label_prop_model.fit(iris.data, labels)
... 
LabelPropagation(...)
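
To make the idea behind the example more concrete, here is a minimal standard-library sketch of label propagation with an RBF kernel and hard clamping, along the lines of the technical report cited below. It is an illustration under stated assumptions, not the wrapped implementation: the function name `propagate_labels` and the default values are hypothetical, and unlabeled points are marked with -1 as in the example above.

```python
import math


def propagate_labels(points, labels, gamma=20.0, max_iter=30, tol=1e-3):
    """Hypothetical sketch: propagate labels over a dense RBF graph."""
    n = len(points)
    classes = sorted({lab for lab in labels if lab != -1})
    n_classes = len(classes)

    # RBF affinities, row-normalized into transition probabilities
    weights = [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
                for q in points] for p in points]
    transition = [[w / sum(row) for w in row] for row in weights]

    # One-hot distributions for labeled points, zeros for unlabeled ones
    clamped = [[1.0 if lab == c else 0.0 for c in classes] for lab in labels]
    dist = [row[:] for row in clamped]

    for _ in range(max_iter):
        new = []
        for i in range(n):
            if labels[i] != -1:
                new.append(clamped[i][:])  # hard clamping of known labels
                continue
            row = [sum(transition[i][j] * dist[j][k] for j in range(n))
                   for k in range(n_classes)]
            total = sum(row)
            new.append([v / total for v in row] if total > 0 else row)
        change = sum(abs(a - b) for r_new, r_old in zip(new, dist)
                     for a, b in zip(r_new, r_old))
        dist = new
        if change < tol:  # steady state reached
            break

    # Transduction: most probable class for every point
    return [classes[max(range(n_classes), key=lambda k: dist[i][k])]
            for i in range(n)]


points = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
predicted = propagate_labels(points, [0, -1, -1, -1, -1, 1])
```

Each unlabeled point ends up with the label of the cluster it is close to, which corresponds to the role of the transduction_ attribute.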

References

Xiaojin Zhu and Zoubin Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, Carnegie Mellon University, 2002 http://pages.cs.wisc.edu/~jerryzhu/pub/CMU-CALD-02-107.pdf

See Also

LabelSpreading : Alternate label propagation strategy more robust to noise

POSSIBLE NODE NAMES:
	LabelPropagationClassifierSklearn LabelPropagationClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LabelSpreadingClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LabelSpreadingClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

LabelSpreading model for semi-supervised learning

This node has been automatically generated by wrapping the sklearn.semi_supervised.label_propagation.LabelSpreading class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This model is similar to the basic Label Propagation algorithm, but uses an affinity matrix based on the normalized graph Laplacian and soft clamping across the labels.
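
The two ingredients named above can be sketched as follows, assuming the update rule from the Zhou et al. paper cited below: a symmetrically normalized affinity matrix S = D^{-1/2} W D^{-1/2} and the soft-clamped update F <- alpha * S F + (1 - alpha) * Y0. The function names are hypothetical, and how the wrapped class maps its alpha parameter onto this rule is not stated here.

```python
import math


def normalized_affinity(weights):
    """S = D^{-1/2} W D^{-1/2}, with the diagonal of W set to zero.

    Assumes every node has at least one neighbor (non-zero degree).
    """
    n = len(weights)
    w = [[0.0 if i == j else weights[i][j] for j in range(n)]
         for i in range(n)]
    degree = [sum(row) for row in w]
    return [[w[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def spread_step(s, f, y0, alpha):
    """One soft-clamped update: F <- alpha * S F + (1 - alpha) * Y0."""
    n, k = len(f), len(f[0])
    return [[alpha * sum(s[i][j] * f[j][c] for j in range(n))
             + (1 - alpha) * y0[i][c]
             for c in range(k)] for i in range(n)]


# Chain graph of three points; the middle point is unlabeled
S = normalized_affinity([[1.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]])
Y0 = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
F1 = spread_step(S, Y0, Y0, alpha=0.5)
```

With alpha = 0 the initial labels are kept unchanged; larger values let more information flow in from neighboring points, so even labeled points can change their distribution.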

Read more in the User Guide.

Parameters

kernel: String identifier for kernel function to use. Only ‘rbf’ and ‘knn’ kernels are currently supported.
gamma: parameter for rbf kernel
n_neighbors: parameter for knn kernel
alpha: clamping factor
max_iter: maximum number of iterations allowed
tol: Convergence tolerance: threshold to consider the system at steady state

Attributes

X_: Input array.
classes_: The distinct labels used in classifying instances.
label_distributions_: Categorical distribution for each item.
transduction_: Label assigned to each item via the transduction.
n_iter_: Number of iterations run.

Examples

>>> import numpy as np
>>> from sklearn import datasets
>>> from sklearn.semi_supervised import LabelSpreading
>>> label_prop_model = LabelSpreading()
>>> iris = datasets.load_iris()
>>> random_unlabeled_points = np.where(np.random.randint(0, 2,
...    size=len(iris.target)))
>>> labels = np.copy(iris.target)
>>> labels[random_unlabeled_points] = -1
>>> label_prop_model.fit(iris.data, labels)
... 
LabelSpreading(...)

References

Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, Bernhard Schoelkopf. Learning with local and global consistency (2004) http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.115.3219

See Also

LabelPropagation : Unregularized graph based semi-supervised learning

POSSIBLE NODE NAMES:
	LabelSpreadingClassifierSklearnNode LabelSpreadingClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LarsCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LarsCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Cross-validated Least Angle Regression model

This node has been automatically generated by wrapping the sklearn.linear_model.least_angle.LarsCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

fit_intercept: boolean. Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
positive: boolean (default=False). Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default.
verbose: boolean or integer, optional. Sets the verbosity amount.
normalize: boolean, optional, default False. If True, the regressors X will be normalized before regression.
copy_X: boolean, optional, default True. If True, X will be copied; else, it may be overwritten.
precompute: True | False | ‘auto’ | array-like. Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.
max_iter: integer, optional. Maximum number of iterations to perform.
cv: int, cross-validation generator or an iterable, optional. Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

max_n_alphas: integer, optional. The maximum number of points on the path used to compute the residuals in the cross-validation.
n_jobs: integer, optional. Number of CPUs to use during the cross validation. If -1, use all the CPUs.
eps: float, optional. The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems.
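
As an illustration of what an integer cv value means, the sketch below splits sample indices into contiguous, unshuffled folds and yields train/test index lists, which is also the shape of "an iterable yielding train/test splits". The function name `kfold_indices` is hypothetical; this is not the sklearn KFold class itself.

```python
def kfold_indices(n_samples, n_folds=3):
    """Yield (train, test) index lists for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds receive one additional sample
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(kfold_indices(7, 3))
```

Every sample appears in exactly one test fold, so each candidate model is scored on data it was not fitted on.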

Attributes

coef_: parameter vector (w in the formulation formula)
intercept_: independent term in decision function
coef_path_: the varying values of the coefficients along the path
alpha_: the estimated regularization parameter alpha
alphas_: the different values of alpha along the path
cv_alphas_: all the values of alpha along the path for the different folds
cv_mse_path_: the mean square error on left-out for each fold along the path (alpha values given by cv_alphas)
n_iter_: the number of iterations run by Lars with the optimal alpha.

See also

lars_path, LassoLars, LassoLarsCV

POSSIBLE NODE NAMES:
	LarsCVRegressorSklearn LarsCVRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LarsRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LarsRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Least Angle Regression model a.k.a. LAR

This node has been automatically generated by wrapping the sklearn.linear_model.least_angle.Lars class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

n_nonzero_coefs: Target number of non-zero coefficients. Use np.inf for no limit.
fit_intercept: Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
positive: Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default.
verbose: Sets the verbosity amount
normalize: If True, the regressors X will be normalized before regression.
precompute: Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.
copy_X: If True, X will be copied; else, it may be overwritten.
eps: The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Unlike the tol parameter in some iterative optimization-based algorithms, this parameter does not control the tolerance of the optimization.
fit_path: If True the full path is stored in the coef_path_ attribute. If you compute the solution for a large problem or many targets, setting fit_path to False will lead to a speedup, especially with a small alpha.

Attributes

alphas_: Maximum of covariances (in absolute value) at each iteration. n_alphas is either n_nonzero_coefs or n_features, whichever is smaller.
active_: Indices of active variables at the end of the path.
coef_path_: The varying values of the coefficients along the path. It is not present if the fit_path parameter is False.
coef_: Parameter vector (w in the formulation formula).
intercept_: Independent term in decision function.
n_iter_: The number of iterations taken by lars_path to find the grid of alphas for each target.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.Lars(n_nonzero_coefs=1)
>>> clf.fit([[-1, 1], [0, 0], [1, 1]], [-1.1111, 0, -1.1111])
... 
Lars(copy_X=True, eps=..., fit_intercept=True, fit_path=True,
   n_nonzero_coefs=1, normalize=True, positive=False, precompute='auto',
   verbose=False)
>>> print(clf.coef_) 
[ 0. -1.11...]

See also

lars_path, LarsCV, sklearn.decomposition.sparse_encode

POSSIBLE NODE NAMES:
	LarsRegressorSklearnNode LarsRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LassoCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LassoCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Lasso linear model with iterative fitting along a regularization path

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.LassoCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The best model is selected by cross-validation.

The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1
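
The objective above can be evaluated directly for a given coefficient vector. The following standard-library sketch does so without an intercept term; the function name `lasso_objective` is hypothetical.

```python
def lasso_objective(X, y, w, alpha):
    """(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1"""
    n_samples = len(y)
    residuals = [yi - sum(xij * wj for xij, wj in zip(xi, w))
                 for xi, yi in zip(X, y)]
    squared_error = sum(r * r for r in residuals)
    l1_norm = sum(abs(wj) for wj in w)
    return squared_error / (2.0 * n_samples) + alpha * l1_norm


value = lasso_objective([[1, 0], [0, 1]], [1, 2], w=[1, 0], alpha=0.5)
```

A larger alpha increases the cost of non-zero coefficients, which is what drives some of them to exactly zero.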

Read more in the User Guide.

Parameters

eps: float, optional. Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.
n_alphas: int, optional. Number of alphas along the regularization path.
alphas: numpy array, optional. List of alphas where to compute the models. If None alphas are set automatically.
precompute: True | False | ‘auto’ | array-like. Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.
max_iter: int, optional. The maximum number of iterations.
tol: float, optional. The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
cv: int, cross-validation generator or an iterable, optional. Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

verbose: bool or integer. Amount of verbosity.
n_jobs: integer, optional. Number of CPUs to use during the cross validation. If -1, use all the CPUs.
positive: bool, optional. If positive, restrict regression coefficients to be positive.
selection: str, default ‘cyclic’. If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.
random_state: int, RandomState instance, or None (default). The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.
fit_intercept: boolean, default True. Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
normalize: boolean, optional, default False. If True, the regressors X will be normalized before regression.
copy_X: boolean, optional, default True. If True, X will be copied; else, it may be overwritten.
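
The relation between eps and n_alphas can be illustrated with a logarithmically spaced grid that runs from alpha_max down to alpha_max * eps. This is a sketch under assumptions: the function name `alpha_grid` and its default values are illustrative, and alpha_max is taken as given rather than derived from the data.

```python
import math


def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Log-spaced alphas from alpha_max down to alpha_max * eps (n_alphas >= 2)."""
    high = math.log10(alpha_max)
    low = math.log10(alpha_max * eps)
    step = (high - low) / (n_alphas - 1)
    return [10 ** (high - i * step) for i in range(n_alphas)]


grid = alpha_grid(1.0, eps=1e-3, n_alphas=4)
```

For this call the grid is approximately 1, 0.1, 0.01, 0.001, so the ratio of the smallest to the largest alpha equals eps.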

Attributes

alpha_: The amount of penalization chosen by cross validation
coef_: parameter vector (w in the cost function formula)
intercept_: independent term in decision function.
mse_path_: mean square error for the test set on each fold, varying alpha
alphas_: The grid of alphas used for fitting
dual_gap_: The dual gap at the end of the optimization for the optimal alpha (alpha_).
n_iter_: number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.

Notes

See examples/linear_model/lasso_path_with_crossvalidation.py for an example.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

See also

lars_path, lasso_path, LassoLars, Lasso, LassoLarsCV

POSSIBLE NODE NAMES:
	LassoCVRegressorSklearnNode LassoCVRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LassoLarsCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LassoLarsCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Cross-validated Lasso, using the LARS algorithm

This node has been automatically generated by wrapping the sklearn.linear_model.least_angle.LassoLarsCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Read more in the User Guide.

Parameters

fit_intercept: boolean. Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
positive: boolean (default=False). Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default. Under the positive restriction the model coefficients do not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent Lasso estimator. As a consequence using LassoLarsCV only makes sense for problems where a sparse solution is expected and/or reached.
verbose: boolean or integer, optional. Sets the verbosity amount.
normalize: boolean, optional, default False. If True, the regressors X will be normalized before regression.
precompute: True | False | ‘auto’ | array-like. Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.
max_iter: integer, optional. Maximum number of iterations to perform.
cv: int, cross-validation generator or an iterable, optional. Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

max_n_alphas: integer, optional. The maximum number of points on the path used to compute the residuals in the cross-validation.
n_jobs: integer, optional. Number of CPUs to use during the cross validation. If -1, use all the CPUs.
eps: float, optional. The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems.
copy_X: boolean, optional, default True. If True, X will be copied; else, it may be overwritten.

Attributes

coef_: parameter vector (w in the formulation formula)
intercept_: independent term in decision function.
coef_path_: the varying values of the coefficients along the path
alpha_: the estimated regularization parameter alpha
alphas_: the different values of alpha along the path
cv_alphas_: all the values of alpha along the path for the different folds
cv_mse_path_: the mean square error on left-out for each fold along the path (alpha values given by cv_alphas)
n_iter_: the number of iterations run by Lars with the optimal alpha.

Notes

The object solves the same problem as the LassoCV object. However, unlike LassoCV, it finds the relevant alpha values by itself. In general, because of this property, it will be more stable. However, it is more fragile to heavily multicollinear datasets.

It is more efficient than the LassoCV if only a small number of features are selected compared to the total number, for instance if there are very few samples compared to the number of features.

See also

lars_path, LassoLars, LarsCV, LassoCV

POSSIBLE NODE NAMES:
	LassoLarsCVRegressorSklearnNode LassoLarsCVRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LassoLarsICRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LassoLarsICRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Lasso model fit with Lars using BIC or AIC for model selection

This node has been automatically generated by wrapping the sklearn.linear_model.least_angle.LassoLarsIC class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

AIC is the Akaike information criterion and BIC is the Bayesian information criterion. Such criteria are useful for selecting the value of the regularization parameter by making a trade-off between the goodness of fit and the complexity of the model. A good model should explain the data well while being simple.
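
The trade-off between goodness of fit and complexity can be sketched with one common least-squares form of the two criteria, n * log(MSE) + K * df, where K = 2 for AIC and K = log(n) for BIC, and the degrees of freedom df are approximated by the number of non-zero coefficients. This form and the function name `information_criterion` are assumptions made for illustration; the exact expression used by the wrapped class is not given here.

```python
import math


def information_criterion(residuals, coef, criterion="bic"):
    """Assumed form: n * log(mean squared error) + K * degrees of freedom.

    Requires at least one non-zero residual, since log(0) is undefined.
    """
    if criterion not in ("aic", "bic"):
        raise ValueError("criterion should be either 'aic' or 'bic'")
    n_samples = len(residuals)
    mse = sum(r * r for r in residuals) / n_samples
    # Degrees of freedom approximated by the number of non-zero coefficients
    df = sum(1 for c in coef if c != 0)
    penalty = 2.0 if criterion == "aic" else math.log(n_samples)
    return n_samples * math.log(mse) + penalty * df


aic = information_criterion([1, -1, 1, -1], [0.0, 2.0], criterion="aic")
bic = information_criterion([1, -1, 1, -1], [0.0, 2.0], criterion="bic")
```

Evaluating this value for every alpha on the path and keeping the alpha with the smallest value corresponds to the selection described above.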

Read more in the User Guide.

Parameters

criterion: The type of criterion to use.
fit_intercept: whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
positive: Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default. Under the positive restriction the model coefficients do not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent Lasso estimator. As a consequence using LassoLarsIC only makes sense for problems where a sparse solution is expected and/or reached.
verbose: Sets the verbosity amount
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
precompute: Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.
max_iter: Maximum number of iterations to perform. Can be used for early stopping.
eps: The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Unlike the tol parameter in some iterative optimization-based algorithms, this parameter does not control the tolerance of the optimization.

Attributes

coef_: parameter vector (w in the formulation formula)
intercept_: independent term in decision function.
alpha_: the alpha parameter chosen by the information criterion
n_iter_: number of iterations run by lars_path to find the grid of alphas.
criterion_: The value of the information criterion (‘aic’ or ‘bic’) across all alphas. The alpha with the smallest information criterion is chosen.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.LassoLarsIC(criterion='bic')
>>> clf.fit([[-1, 1], [0, 0], [1, 1]], [-1.1111, 0, -1.1111])
... 
LassoLarsIC(copy_X=True, criterion='bic', eps=..., fit_intercept=True,
      max_iter=500, normalize=True, positive=False, precompute='auto',
      verbose=False)
>>> print(clf.coef_) 
[ 0.  -1.11...]

Notes

The estimation of the number of degrees of freedom is given by:

“On the degrees of freedom of the lasso” Hui Zou, Trevor Hastie, and Robert Tibshirani Ann. Statist. Volume 35, Number 5 (2007), 2173-2192.

http://en.wikipedia.org/wiki/Akaike_information_criterion http://en.wikipedia.org/wiki/Bayesian_information_criterion

See also

lars_path, LassoLars, LassoLarsCV

POSSIBLE NODE NAMES:
	LassoLarsICRegressorSklearn LassoLarsICRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LassoLarsRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LassoLarsRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Lasso model fit with Least Angle Regression a.k.a. Lars

This node has been automatically generated by wrapping the sklearn.linear_model.least_angle.LassoLars class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

It is a Linear Model trained with an L1 prior as regularizer.

The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Read more in the User Guide.

Parameters

alpha: Constant that multiplies the penalty term. Defaults to 1.0. alpha = 0 is equivalent to an ordinary least square, solved by LinearRegression. For numerical reasons, using alpha = 0 with the LassoLars object is not advised and you should prefer the LinearRegression object.
fit_intercept: whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
positive: Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default. Under the positive restriction the model coefficients will not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent Lasso estimator.
verbose: Sets the verbosity amount
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
precompute: Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.
max_iter: Maximum number of iterations to perform.
eps: The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Unlike the tol parameter in some iterative optimization-based algorithms, this parameter does not control the tolerance of the optimization.
fit_path: If True the full path is stored in the coef_path_ attribute. If you compute the solution for a large problem or many targets, setting fit_path to False will lead to a speedup, especially with a small alpha.

Attributes

alphas_: Maximum of covariances (in absolute value) at each iteration. n_alphas is either max_iter, n_features, or the number of nodes in the path with correlation greater than alpha, whichever is smaller.
active_: Indices of active variables at the end of the path.
coef_path_: If a list is passed it’s expected to be one of n_targets such arrays. The varying values of the coefficients along the path. It is not present if the fit_path parameter is False.
coef_: Parameter vector (w in the formulation formula).
intercept_: Independent term in decision function.
n_iter_: The number of iterations taken by lars_path to find the grid of alphas for each target.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.LassoLars(alpha=0.01)
>>> clf.fit([[-1, 1], [0, 0], [1, 1]], [-1, 0, -1])
... 
LassoLars(alpha=0.01, copy_X=True, eps=..., fit_intercept=True,
     fit_path=True, max_iter=500, normalize=True, positive=False,
     precompute='auto', verbose=False)
>>> print(clf.coef_) 
[ 0.         -0.963257...]

See also

lars_path, lasso_path, Lasso, LassoCV, LassoLarsCV, sklearn.decomposition.sparse_encode

POSSIBLE NODE NAMES:
	LassoLarsRegressorSklearn LassoLarsRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LassoRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LassoRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear Model trained with L1 prior as regularizer (aka the Lasso)

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.Lasso class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Technically the Lasso model is optimizing the same objective function as the Elastic Net with l1_ratio=1.0 (no L2 penalty).

Read more in the User Guide.

Parameters

alpha: Constant that multiplies the L1 term. Defaults to 1.0. alpha = 0 is equivalent to an ordinary least square, solved by the LinearRegression object. For numerical reasons, using alpha = 0 with the Lasso object is not advised and you should prefer the LinearRegression object.
fit_intercept: whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
precompute: Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument. For sparse input this option is always True to preserve sparsity. WARNING : The 'auto' option is deprecated and will be removed in 0.18.
max_iter: The maximum number of iterations
tol: The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
warm_start: When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.
positive: When set to True, forces the coefficients to be positive.
selection: If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.
random_state: The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.

Attributes

coef_: parameter vector (w in the cost function formula)
sparse_coef_: sparse_coef_ is a readonly property derived from coef_
intercept_: independent term in decision function.
n_iter_: number of iterations run by the coordinate descent solver to reach the specified tolerance.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.Lasso(alpha=0.1)
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,
   normalize=False, positive=False, precompute=False, random_state=None,
   selection='cyclic', tol=0.0001, warm_start=False)
>>> print(clf.coef_)
[ 0.85  0.  ]
>>> print(clf.intercept_)
0.15

See also

lars_path, lasso_path, LassoLars, LassoCV, LassoLarsCV, sklearn.decomposition.sparse_encode

Notes

The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.
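
A minimal standard-library sketch of cyclic coordinate descent for the Lasso objective follows. It centers the data to handle the intercept and applies a soft-thresholding update to one coefficient at a time. The function names are hypothetical, and the sketch runs a fixed number of sweeps instead of the dual-gap check described for tol, so it is an illustration rather than the wrapped solver.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Hypothetical sketch: cyclic coordinate descent with an intercept."""
    n, p = len(X), len(X[0])
    # Center the data so the intercept can be recovered afterwards
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [yi - y_mean for yi in y]

    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j
            rho = sum(
                Xc[i][j] * (yc[i] - sum(Xc[i][k] * w[k]
                                        for k in range(p) if k != j))
                for i in range(n)) / n
            norm = sum(Xc[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / norm if norm > 0 else 0.0

    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, intercept


coef, intercept = lasso_coordinate_descent(
    [[0, 0], [1, 1], [2, 2]], [0, 1, 2], alpha=0.1)
```

On the data of the example above, this yields coefficients close to 0.85 and 0 and an intercept close to 0.15, up to floating-point rounding.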

POSSIBLE NODE NAMES:
	LassoRegressorSklearn LassoRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LatentDirichletAllocationTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LatentDirichletAllocationTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Latent Dirichlet Allocation with online variational Bayes algorithm

This node has been automatically generated by wrapping the sklearn.decomposition.online_lda.LatentDirichletAllocation class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

New in version 0.17.

Parameters

n_topics

:int, optional (default=10)Number of topics.

doc_topic_prior

:float, optional (default=None)Prior of document topic distribution theta. If the value is None, defaults to 1 / n_topics. In the literature, this is called alpha.

topic_word_prior

:float, optional (default=None)Prior of topic word distribution beta. If the value is None, defaults to 1 / n_topics. In the literature, this is called eta.

learning_method

:‘batch’ | ‘online’, default=’online’

Method used to update _component. Only used in fit method. In general, if the data size is large, the online update will be much faster than the batch update. Valid options:

'batch': Batch variational Bayes method. Use all training data in
    each EM update.
    Old `components_` will be overwritten in each iteration.
'online': Online variational Bayes method. In each EM update, use
    mini-batch of training data to update the ``components_``
    variable incrementally. The learning rate is controlled by the
    ``learning_decay`` and the ``learning_offset`` parameters.

learning_decay

:float, optional (default=0.7)

A parameter that controls the learning rate in the online learning method. The value should lie in the interval (0.5, 1.0] to guarantee asymptotic convergence. When the value is 0.0 and batch_size is n_samples, the update method is the same as batch learning. In the literature, this is called kappa.

learning_offset

:float, optional (default=10.)

A (positive) parameter that downweights early iterations in online learning. It should be greater than 1.0. In the literature, this is called tau_0.

max_iter

:integer, optional (default=10)

The maximum number of iterations.

total_samples

:int, optional (default=1e6)

Total number of documents. Only used in the partial_fit method.

batch_size

:int, optional (default=128)

Number of documents to use in each EM iteration. Only used in online learning.

evaluate_every

:int, optional (default=0)

How often to evaluate perplexity. Only used in the fit method. Set it to 0 or a negative number to not evaluate perplexity during training at all. Evaluating perplexity can help you check convergence of the training process, but it also increases total training time. Evaluating perplexity in every iteration might increase training time up to two-fold.

perp_tol

:float, optional (default=1e-1)

Perplexity tolerance in batch learning. Only used when evaluate_every is greater than 0.

mean_change_tol

:float, optional (default=1e-3)

Stopping tolerance for updating document topic distribution in the E-step.

max_doc_update_iter

:int (default=100)

Max number of iterations for updating document topic distribution in the E-step.

n_jobs

:int, optional (default=1)

The number of jobs to use in the E-step. If -1, all CPUs are used. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used.

verbose

:int, optional (default=0)

Verbosity level.

random_state

:int or RandomState instance or None, optional (default=None)

Pseudo-random number generator seed control.

Attributes

components_: Topic word distribution. components_[i, j] represents word j in topic i. In the literature, this is called lambda.
n_batch_iter_: Number of iterations of the EM step.
n_iter_: Number of passes over the dataset.
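
As a rough illustration of how learning_decay (kappa) and learning_offset (tau_0) interact in online mode, the sketch below computes the step size used to blend each mini-batch estimate into the running value, following the schedule described in reference [1]. The function names are hypothetical and are not part of sklearn or pySPACE; t plays the role of the mini-batch counter n_batch_iter_.

```python
def online_learning_rate(t, learning_decay=0.7, learning_offset=10.0):
    """Step size for mini-batch update number t (t = 0, 1, 2, ...)."""
    return (learning_offset + t) ** (-learning_decay)


def blend(old_value, batch_estimate, rate):
    """Move a running estimate towards the mini-batch estimate by `rate`."""
    return (1.0 - rate) * old_value + rate * batch_estimate


# Step sizes shrink as more mini-batches are seen; a larger learning_offset
# makes the first updates smaller.
rates = [online_learning_rate(t) for t in range(5)]
```

With learning_decay set to 0.0 the step size is always 1, so each update replaces the old value completely, which corresponds to the batch-like behaviour mentioned for learning_decay above.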

References

[1] “Online Learning for Latent Dirichlet Allocation”, Matthew D. Hoffman, David M. Blei, Francis Bach, 2010
[2] “Stochastic Variational Inference”, Matthew D. Hoffman, David M. Blei, Chong Wang, John Paisley, 2013
[3] Matthew D. Hoffman’s onlineldavb code. Link: http://www.cs.princeton.edu/~mdhoffma/code/onlineldavb.tar

POSSIBLE NODE NAMES:
	LatentDirichletAllocationTransformerSklearn LatentDirichletAllocationTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LinearDiscriminantAnalysisClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LinearDiscriminantAnalysisClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear Discriminant Analysis

This node has been automatically generated by wrapping the sklearn.discriminant_analysis.LinearDiscriminantAnalysis class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes’ rule.

The model fits a Gaussian density to each class, assuming that all classes share the same covariance matrix.

The fitted model can also be used to reduce the dimensionality of the input by projecting it to the most discriminative directions.

New in version 0.17: LinearDiscriminantAnalysis.

Changed in version 0.17: The deprecated lda.LDA has been moved to LinearDiscriminantAnalysis.

Parameters

solver

:string, optional

Solver to use, possible values:

‘svd’: Singular value decomposition (default). Does not compute the covariance matrix, therefore this solver is recommended for data with a large number of features.

‘lsqr’: Least squares solution, can be combined with shrinkage.

‘eigen’: Eigenvalue decomposition, can be combined with shrinkage.

shrinkage

:string or float, optional

Shrinkage parameter, possible values:

None: no shrinkage (default).

‘auto’: automatic shrinkage using the Ledoit-Wolf lemma.

float between 0 and 1: fixed shrinkage parameter.

Note that shrinkage works only with ‘lsqr’ and ‘eigen’ solvers.

priors

:array, optional, shape (n_classes,)

Class priors.

n_components

:int, optional

Number of components (< n_classes - 1) for dimensionality reduction.

store_covariance

:bool, optional

Additionally compute class covariance matrix (default False).

New in version 0.17.

tol

:float, optional

Threshold used for rank estimation in SVD solver.

New in version 0.17.

Attributes

coef_: Weight vector(s).
intercept_: Intercept term.
covariance_: Covariance matrix (shared by all classes).
explained_variance_ratio_: Percentage of variance explained by each of the selected components. If n_components is not set then all components are stored and the sum of explained variances is equal to 1.0. Only available when eigen solver is used.
means_: Class means.
priors_: Class priors (sum to 1).
scalings_: Scaling of the features in the space spanned by the class centroids.
xbar_: Overall mean.
classes_: Unique class labels.

See also

sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis: Quadratic Discriminant Analysis

Notes

The default solver is ‘svd’. It can perform both classification and transform, and it does not rely on the calculation of the covariance matrix. This can be an advantage in situations where the number of features is large. However, the ‘svd’ solver cannot be used with shrinkage.

The ‘lsqr’ solver is an efficient algorithm that only works for classification. It supports shrinkage.

The ‘eigen’ solver is based on the optimization of the between class scatter to within class scatter ratio. It can be used for both classification and transform, and it supports shrinkage. However, the ‘eigen’ solver needs to compute the covariance matrix, so it might not be suitable for situations with a high number of features.

Examples

>>> import numpy as np
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = LinearDiscriminantAnalysis()
>>> clf.fit(X, y)
LinearDiscriminantAnalysis(n_components=None, priors=None, shrinkage=None,
              solver='svd', store_covariance=False, tol=0.0001)
>>> print(clf.predict([[-0.8, -1]]))
[1]
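
For intuition, the prediction in the example above can be reproduced by hand with the textbook rule: estimate one mean per class, pool the within-class covariance, and pick the class with the largest linear discriminant score. This is a minimal sketch for two features only, not the ‘svd’ implementation used by sklearn; fit_lda_2d and predict_lda_2d are hypothetical helper names.

```python
import math

X = [[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]]
y = [1, 1, 1, 2, 2, 2]


def fit_lda_2d(X, y):
    """Return {label: (weights, bias)} for 2-D inputs with a shared covariance."""
    labels = sorted(set(y))
    n = float(len(X))
    means, priors = {}, {}
    sxx = sxy = syy = 0.0
    for label in labels:
        rows = [x for x, t in zip(X, y) if t == label]
        mx = sum(r[0] for r in rows) / float(len(rows))
        my = sum(r[1] for r in rows) / float(len(rows))
        means[label] = (mx, my)
        priors[label] = len(rows) / n
        # Accumulate the within-class scatter around each class mean.
        for r in rows:
            dx, dy = r[0] - mx, r[1] - my
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    dof = n - len(labels)
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    # Invert the pooled 2x2 covariance matrix explicitly.
    det = sxx * syy - sxy * sxy
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    model = {}
    for label in labels:
        mx, my = means[label]
        w = (inv[0][0] * mx + inv[0][1] * my, inv[1][0] * mx + inv[1][1] * my)
        b = -0.5 * (w[0] * mx + w[1] * my) + math.log(priors[label])
        model[label] = (w, b)
    return model


def predict_lda_2d(model, x):
    """Pick the label whose linear discriminant score is largest."""
    return max(model, key=lambda k: model[k][0][0] * x[0] + model[k][0][1] * x[1] + model[k][1])


model = fit_lda_2d(X, y)
label = predict_lda_2d(model, [-0.8, -1])
```

Because each class contributes only a weight vector and a bias, the resulting decision boundary is linear, as stated in the class description.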

POSSIBLE NODE NAMES:
	LinearDiscriminantAnalysisClassifierSklearn LinearDiscriminantAnalysisClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LinearRegressionSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LinearRegressionSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Ordinary least squares Linear Regression.

This node has been automatically generated by wrapping the sklearn.linear_model.base.LinearRegression class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Parameters

fit_intercept: Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. data is expected to be already centered).
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
n_jobs: The number of jobs to use for the computation. If -1, all CPUs are used. This will only provide a speedup for n_targets > 1 and sufficiently large problems.

Attributes

coef_: Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.
intercept_: Independent term in the linear model.

Notes

From the implementation point of view, this is just plain Ordinary Least Squares (scipy.linalg.lstsq) wrapped as a predictor object.
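
To make the role of fit_intercept concrete, here is a small closed-form sketch of ordinary least squares for a single feature. It is an illustration only (the wrapped class delegates to a general least-squares solver); fit_simple_ols is a hypothetical helper name.

```python
def fit_simple_ols(x, y, fit_intercept=True):
    """Least-squares fit of y = coef * x + intercept for one feature."""
    n = float(len(x))
    if fit_intercept:
        x_mean = sum(x) / n
        y_mean = sum(y) / n
    else:
        # Without an intercept the line is forced through the origin.
        x_mean = y_mean = 0.0
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    coef = sxy / sxx
    intercept = y_mean - coef * x_mean
    return coef, intercept


x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]  # exactly y = 2 * x + 1
coef, intercept = fit_simple_ols(x, y)
coef_origin, intercept_origin = fit_simple_ols(x, y, fit_intercept=False)
```

With fit_intercept=False the slope has to absorb the offset, which is why uncentered data gives a different coefficient in that mode.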

POSSIBLE NODE NAMES:
	LinearRegressionSklearn LinearRegressionSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LinearSVCClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LinearSVCClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear Support Vector Classification.

This node has been automatically generated by wrapping the sklearn.svm.classes.LinearSVC class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Similar to SVC with parameter kernel=’linear’, but implemented in terms of liblinear rather than libsvm, so it has more flexibility in the choice of penalties and loss functions and should scale better to large numbers of samples.

This class supports both dense and sparse input and the multiclass support is handled according to a one-vs-the-rest scheme.

Read more in the User Guide.

Parameters

C: Penalty parameter C of the error term.
loss: Specifies the loss function. ‘hinge’ is the standard SVM loss (used e.g. by the SVC class) while ‘squared_hinge’ is the square of the hinge loss.
penalty: Specifies the norm used in the penalization. The ‘l2’ penalty is the standard used in SVC. The ‘l1’ leads to coef_ vectors that are sparse.
dual: Select the algorithm to either solve the dual or primal optimization problem. Prefer dual=False when n_samples > n_features.
tol: Tolerance for stopping criteria.
multi_class: Either ‘ovr’ or ‘crammer_singer’ (default=‘ovr’). Determines the multi-class strategy if y contains more than two classes. ‘ovr’ trains n_classes one-vs-rest classifiers, while ‘crammer_singer’ optimizes a joint objective over all classes. While crammer_singer is interesting from a theoretical perspective as it is consistent, it is seldom used in practice as it rarely leads to better accuracy and is more expensive to compute. If ‘crammer_singer’ is chosen, the options loss, penalty and dual will be ignored.
fit_intercept: Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be already centered).
intercept_scaling: When self.fit_intercept is True, instance vector x becomes [x, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight. Note that the synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.
class_weight: Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).
verbose: Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in liblinear that, if enabled, may not work properly in a multithreaded context.
random_state: The seed of the pseudo random number generator to use when shuffling the data.
max_iter: The maximum number of iterations to be run.
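
The two values of the loss parameter differ only in how strongly margin violations are penalized. The sketch below evaluates both for a single sample, assuming labels encoded as -1 / +1 and a raw decision score; the function names are illustrative and not part of sklearn.

```python
def hinge(y, score):
    """Standard SVM loss: zero once the sample is beyond the margin."""
    return max(0.0, 1.0 - y * score)


def squared_hinge(y, score):
    """Square of the hinge loss: penalizes large violations more strongly."""
    return hinge(y, score) ** 2


# A correctly classified sample inside the margin (score 0.5 for label +1)
# and a misclassified sample (score 0.5 for label -1).
inside_margin = (hinge(1, 0.5), squared_hinge(1, 0.5))
misclassified = (hinge(-1, 0.5), squared_hinge(-1, 0.5))
```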

Attributes

coef_

:array, shape = [n_features] if n_classes == 2 else [n_classes, n_features]

Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.

coef_ is a readonly property derived from raw_coef_ that follows the internal memory layout of liblinear.

intercept_

:array, shape = [1] if n_classes == 2 else [n_classes]

Constants in decision function.

Notes

The underlying C implementation uses a random number generator to select features when fitting the model. It is thus not uncommon to have slightly different results for the same input data. If that happens, try with a smaller tol parameter.

The underlying implementation, liblinear, uses a sparse internal representation for the data that will incur a memory copy.

Predict output may not match that of standalone liblinear in certain cases. See differences from liblinear in the narrative documentation.
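
The intercept_scaling parameter described above can be read as follows: the solver works on an augmented vector and the intercept is recovered from the weight of the appended synthetic feature. A minimal sketch with made-up numbers (all names and values here are hypothetical):

```python
def augment(x, intercept_scaling=1.0):
    """Append the constant synthetic feature to an instance vector."""
    return list(x) + [intercept_scaling]


def split_solution(augmented_weights, intercept_scaling=1.0):
    """Separate feature weights from the intercept implied by the last weight."""
    coef = list(augmented_weights[:-1])
    intercept = intercept_scaling * augmented_weights[-1]
    return coef, intercept


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


weights = [0.5, -1.0, 0.2]  # last entry is the synthetic feature weight
x = [2.0, 1.0]
scaling = 10.0

coef, intercept = split_solution(weights, scaling)
score_augmented = dot(weights, augment(x, scaling))
score_explicit = dot(coef, x) + intercept
```

Both scores agree. Since the regularizer acts on the synthetic weight (0.2 here) rather than on the intercept itself, a larger intercept_scaling lets the same intercept be expressed with a smaller, less penalized weight.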

References

LIBLINEAR: A Library for Large Linear Classification

See also

SVC

Implementation of Support Vector Machine classifier using libsvm: the kernel can be non-linear but its SMO algorithm does not scale to large number of samples as LinearSVC does.

Furthermore SVC multi-class mode is implemented using one vs one scheme while LinearSVC uses one vs the rest. It is possible to implement one vs the rest with SVC by using the sklearn.multiclass.OneVsRestClassifier wrapper.

Finally SVC can fit dense data without memory copy if the input is C-contiguous. Sparse data will still incur memory copy though.

sklearn.linear_model.SGDClassifier

SGDClassifier can optimize the same cost function as LinearSVC by adjusting the penalty and loss parameters. In addition it requires less memory, allows incremental (online) learning, and implements various loss functions and regularization regimes.

POSSIBLE NODE NAMES:
	LinearSVCClassifierSklearnNode LinearSVCClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LinearSVRRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LinearSVRRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear Support Vector Regression.

This node has been automatically generated by wrapping the sklearn.svm.classes.LinearSVR class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Similar to SVR with parameter kernel=’linear’, but implemented in terms of liblinear rather than libsvm, so it has more flexibility in the choice of penalties and loss functions and should scale better to large numbers of samples.

This class supports both dense and sparse input.

Read more in the User Guide.

Parameters

C: Penalty parameter C of the error term. The penalty is a squared l2 penalty. The bigger this parameter, the less regularization is used.
loss: Specifies the loss function. ‘l1’ is the epsilon-insensitive loss (standard SVR) while ‘l2’ is the squared epsilon-insensitive loss.
epsilon: Epsilon parameter in the epsilon-insensitive loss function. Note that the value of this parameter depends on the scale of the target variable y. If unsure, set epsilon=0.
dual: Select the algorithm to either solve the dual or primal optimization problem. Prefer dual=False when n_samples > n_features.
tol: Tolerance for stopping criteria.
fit_intercept: Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be already centered).
intercept_scaling: When self.fit_intercept is True, instance vector x becomes [x, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight. Note that the synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.
verbose: Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in liblinear that, if enabled, may not work properly in a multithreaded context.
random_state: The seed of the pseudo random number generator to use when shuffling the data.
max_iter: The maximum number of iterations to be run.
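
The effect of epsilon on the two loss variants can be checked on single residuals. The following sketch is illustrative only; the function names are not sklearn API.

```python
def epsilon_insensitive(y_true, y_pred, epsilon=0.0):
    """Absolute error, ignoring deviations of at most epsilon."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def squared_epsilon_insensitive(y_true, y_pred, epsilon=0.0):
    """Squared variant of the epsilon-insensitive loss."""
    return epsilon_insensitive(y_true, y_pred, epsilon) ** 2


inside_tube = epsilon_insensitive(3.0, 2.5, epsilon=1.0)   # residual 0.5 <= epsilon
outside_tube = epsilon_insensitive(3.0, 1.0, epsilon=0.5)  # residual 2.0 > epsilon
outside_tube_sq = squared_epsilon_insensitive(3.0, 1.0, epsilon=0.5)
```

Because the residual is compared with epsilon directly, a sensible value depends on the scale of the target variable, as noted for the epsilon parameter.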

Attributes

coef_

:array, shape = [n_features] if n_classes == 2 else [n_classes, n_features]

Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.

coef_ is a readonly property derived from raw_coef_ that follows the internal memory layout of liblinear.

intercept_

:array, shape = [1] if n_classes == 2 else [n_classes]Constants in decision function.

See also

LinearSVC

Implementation of Support Vector Machine classifier using the same library as this class (liblinear).

SVR

Implementation of Support Vector Machine regression using libsvm: the kernel can be non-linear but its SMO algorithm does not scale to large number of samples as LinearSVC does.

sklearn.linear_model.SGDRegressor

SGDRegressor can optimize the same cost function as LinearSVR by adjusting the penalty and loss parameters. In addition it requires less memory, allows incremental (online) learning, and implements various loss functions and regularization regimes.

POSSIBLE NODE NAMES:
	LinearSVRRegressorSklearn LinearSVRRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LocallyLinearEmbeddingTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LocallyLinearEmbeddingTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Locally Linear Embedding

This node has been automatically generated by wrapping the sklearn.manifold.locally_linear.LocallyLinearEmbedding class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

n_neighbors

:integer

Number of neighbors to consider for each point.

n_components

:integer

Number of coordinates for the manifold.

reg

:float

Regularization constant, multiplies the trace of the local covariance matrix of the distances.

eigen_solver

:string, {‘auto’, ‘arpack’, ‘dense’}

auto: algorithm will attempt to choose the best method for input data.

arpack: use Arnoldi iteration in shift-invert mode. For this method, M may be a dense matrix, sparse matrix, or general linear operator. Warning: ARPACK can be unstable for some problems. It is best to try several random seeds in order to check results.
dense: use standard dense matrix operations for the eigenvalue decomposition. For this method, M must be an array or matrix type. This method should be avoided for large problems.

tol

:float, optional

Tolerance for ‘arpack’ method. Not used if eigen_solver==’dense’.

max_iter

:integer

Maximum number of iterations for the arpack solver. Not used if eigen_solver==’dense’.

method

:string (‘standard’, ‘hessian’, ‘modified’ or ‘ltsa’)

standard: reference [1]
hessian: requires n_neighbors > n_components * (1 + (n_components + 1) / 2); see reference [2]
modified: see reference [3]
ltsa: see reference [4]

hessian_tol

:float, optional

Tolerance for Hessian eigenmapping method. Only used if method == 'hessian'.

modified_tol

:float, optional

Tolerance for modified LLE method. Only used if method == 'modified'.

neighbors_algorithm

:string [‘auto’|’brute’|’kd_tree’|’ball_tree’]

Algorithm to use for nearest neighbors search, passed to neighbors.NearestNeighbors instance.

random_state

:numpy.RandomState or int, optional

The generator or seed used to determine the starting vector for arpack iterations. Defaults to numpy.random.
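
The neighbor requirement for method ‘hessian’ is easy to get wrong when n_components is changed. The sketch below evaluates the bound n_components * (1 + (n_components + 1) / 2), which equals n_components * (n_components + 3) / 2, before a node is configured. The helper names are hypothetical.

```python
def hessian_neighbor_bound(n_components):
    """Return n_components * (n_components + 3) / 2 as an exact integer."""
    # n_components and n_components + 3 have opposite parity, so the
    # product is always even and integer division loses nothing.
    return n_components * (n_components + 3) // 2


def enough_neighbors_for_hessian(n_neighbors, n_components):
    """True if n_neighbors strictly exceeds the bound for method='hessian'."""
    return n_neighbors > hessian_neighbor_bound(n_components)


bound_2d = hessian_neighbor_bound(2)
ok_with_5 = enough_neighbors_for_hessian(5, 2)
ok_with_6 = enough_neighbors_for_hessian(6, 2)
```

For a two-dimensional embedding the bound is 5, so at least 6 neighbors are needed.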

Attributes

embedding_vectors_: Stores the embedding vectors
reconstruction_error_: Reconstruction error associated with embedding_vectors_
nbrs_: Stores nearest neighbors instance, including BallTree or KDtree if applicable.

References

[1]	Roweis, S. & Saul, L. Nonlinear dimensionality reduction by locally linear embedding. Science 290:2323 (2000).

[2]	Donoho, D. & Grimes, C. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc Natl Acad Sci U S A. 100:5591 (2003).

[3]	Zhang, Z. & Wang, J. MLLE: Modified Locally Linear Embedding Using Multiple Weights. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.70.382

[4]	Zhang, Z. & Zha, H. Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. Journal of Shanghai Univ. 8:406 (2004)

POSSIBLE NODE NAMES:
	LocallyLinearEmbeddingTransformerSklearn LocallyLinearEmbeddingTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LogisticRegressionCVClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LogisticRegressionCVClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Logistic Regression CV (aka logit, MaxEnt) classifier.

This node has been automatically generated by wrapping the sklearn.linear_model.logistic.LogisticRegressionCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This class implements logistic regression using the liblinear, newton-cg, sag or lbfgs optimizer. The newton-cg, sag and lbfgs solvers support only L2 regularization with primal formulation. The liblinear solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty.

For the grid of Cs values (set by default to ten values on a logarithmic scale between 1e-4 and 1e4), the best hyperparameter is selected by the cross-validator StratifiedKFold, but this can be changed using the cv parameter. In the case of the newton-cg and lbfgs solvers, we warm start along the path, i.e. the initial coefficients of the present fit are taken to be the coefficients obtained after convergence in the previous fit, so it is supposed to be faster for high-dimensional dense data.

For a multiclass problem, the hyperparameters for each class are computed using the best scores obtained by doing a one-vs-rest in parallel across all folds and classes. Hence this is not the true multinomial loss.

Read more in the User Guide.

Parameters

Cs

:list of floats | int

Each of the values in Cs describes the inverse of regularization strength. If Cs is an int, then a grid of Cs values is chosen on a logarithmic scale between 1e-4 and 1e4. Like in support vector machines, smaller values specify stronger regularization.

fit_intercept

:bool, default: TrueSpecifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

class_weight

:dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.

New in version 0.17: class_weight == ‘balanced’

cv

:integer or cross-validation generator

The default cross-validation generator used is Stratified K-Folds. If an integer is provided, then it is the number of folds used. See the sklearn.cross_validation module for the list of possible cross-validation objects.

penalty

:str, ‘l1’ or ‘l2’Used to specify the norm used in the penalization. The newton-cg and lbfgs solvers support only l2 penalties.

dual

:boolDual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.

scoring

:callable

Scoring function to use as cross-validation criterion. For a list of scoring functions that can be used, look at sklearn.metrics. The default scoring option used is accuracy_score.

solver

:{‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’}

Algorithm to use in the optimization problem.

For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ is faster for large ones.
For multiclass problems, only ‘newton-cg’ and ‘lbfgs’ handle multinomial loss; ‘sag’ and ‘liblinear’ are limited to one-versus-rest schemes.
‘newton-cg’, ‘lbfgs’ and ‘sag’ only handle L2 penalty.
‘liblinear’ might be slower in LogisticRegressionCV because it does not handle warm-starting.

tol

:float, optionalTolerance for stopping criteria.

max_iter

:int, optionalMaximum number of iterations of the optimization algorithm.

n_jobs

:int, optionalNumber of CPU cores used during the cross-validation loop. If given a value of -1, all cores are used.

verbose

:intFor the ‘liblinear’, ‘sag’ and ‘lbfgs’ solvers set verbose to any positive number for verbosity.

refit

:boolIf set to True, the scores are averaged across all folds, and the coefs and the C that corresponds to the best score is taken, and a final refit is done using these parameters. Otherwise the coefs, intercepts and C that correspond to the best scores across folds are averaged.

multi_class

:str, {‘ovr’, ‘multinomial’}Multiclass option can be either ‘ovr’ or ‘multinomial’. If the option chosen is ‘ovr’, then a binary problem is fit for each label. Else the loss minimised is the multinomial loss fit across the entire probability distribution. Works only for ‘lbfgs’ and ‘newton-cg’ solvers.

intercept_scaling

:float, default 1.

Useful only when the solver ‘liblinear’ is used and self.fit_intercept is set to True. In this case, x becomes [x, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight. Note that the synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.

random_state

:int seed, RandomState instance, or None (default)The seed of the pseudo random number generator to use when shuffling the data.
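
When Cs is given as an integer, the candidate values are spread evenly on a logarithmic scale between 1e-4 and 1e4. The sketch below builds such a grid in plain Python to show what the default of ten values looks like; default_cs_grid is a hypothetical helper, not the function sklearn uses internally.

```python
def default_cs_grid(n_values=10, low_exp=-4.0, high_exp=4.0):
    """Return n_values candidates for C, evenly spaced in log10 space."""
    step = (high_exp - low_exp) / (n_values - 1)
    return [10.0 ** (low_exp + i * step) for i in range(n_values)]


grid = default_cs_grid()
# Consecutive candidates differ by a constant factor of 10 ** (8 / 9).
ratios = [grid[i + 1] / grid[i] for i in range(len(grid) - 1)]
```

Small entries of the grid correspond to strong regularization, large entries to weak regularization.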

Attributes

coef_

:array, shape (1, n_features) or (n_classes, n_features)

Coefficient of the features in the decision function.

coef_ is of shape (1, n_features) when the given problem is binary. coef_ is readonly property derived from raw_coef_ that follows the internal memory layout of liblinear.

intercept_

:array, shape (1,) or (n_classes,)

Intercept (a.k.a. bias) added to the decision function. It is available only when parameter fit_intercept is set to True and is of shape (1,) when the problem is binary.

Cs_

:arrayArray of C i.e. inverse of regularization parameter values used for cross-validation.

coefs_paths_

:array, shape (n_folds, len(Cs_), n_features) or (n_folds, len(Cs_), n_features + 1)dict with classes as the keys, and the path of coefficients obtained during cross-validating across each fold and then across each Cs after doing an OvR for the corresponding class as values. If the ‘multi_class’ option is set to ‘multinomial’, then the coefs_paths are the coefficients corresponding to each class. Each dict value has shape (n_folds, len(Cs_), n_features) or (n_folds, len(Cs_), n_features + 1) depending on whether the intercept is fit or not.

scores_

:dict

Dict with classes as the keys, and the values as the grid of scores obtained during cross-validating each fold, after doing an OvR for the corresponding class. If the ‘multi_class’ option given is ‘multinomial’ then the same scores are repeated across all classes, since this is the multinomial class. Each dict value has shape (n_folds, len(Cs)).

C_

:array, shape (n_classes,) or (n_classes - 1,)Array of C that maps to the best scores across every class. If refit is set to False, then for each class, the best C is the average of the C’s that correspond to the best scores for each fold.

n_iter_

:array, shape (n_classes, n_folds, n_cs) or (1, n_folds, n_cs)Actual number of iterations for all classes, folds and Cs. In the binary or multinomial cases, the first dimension is equal to 1.
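
The difference between refit=True and refit=False, as it affects C_, can be sketched on a small grid of cross-validation scores for one class. The scores and the helper name below are made up for illustration; the final refit of the coefficients that follows when refit=True is not shown.

```python
def select_c(cs, fold_scores, refit=True):
    """Pick C from per-fold scores; fold_scores[f][i] is the score of cs[i] on fold f."""
    if refit:
        # Combine scores over folds, then take the single best C.
        totals = [sum(fold[i] for fold in fold_scores) for i in range(len(cs))]
        return cs[totals.index(max(totals))]
    # Otherwise take the best C of every fold and average those values.
    best_per_fold = [cs[fold.index(max(fold))] for fold in fold_scores]
    return sum(best_per_fold) / float(len(best_per_fold))


cs = [0.1, 1.0, 10.0]
fold_scores = [
    [0.60, 0.90, 0.80],  # fold 0 prefers C = 1.0
    [0.70, 0.75, 0.90],  # fold 1 prefers C = 10.0
]

c_refit = select_c(cs, fold_scores, refit=True)
c_averaged = select_c(cs, fold_scores, refit=False)
```

Here the fold-combined scores favour C = 10.0, while averaging the per-fold winners yields 5.5, a value that is not on the original grid.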

See also

LogisticRegression

POSSIBLE NODE NAMES:
	LogisticRegressionCVClassifierSklearnNode LogisticRegressionCVClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.LogisticRegressionClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.LogisticRegressionClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Logistic Regression (aka logit, MaxEnt) classifier.

This node has been automatically generated by wrapping the sklearn.linear_model.logistic.LogisticRegression class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the ‘multi_class’ option is set to ‘ovr’ and uses the cross-entropy loss, if the ‘multi_class’ option is set to ‘multinomial’. (Currently the ‘multinomial’ option is supported only by the ‘lbfgs’ and ‘newton-cg’ solvers.)

This class implements regularized logistic regression using the liblinear library, newton-cg and lbfgs solvers. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied).

The newton-cg and lbfgs solvers support only L2 regularization with primal formulation. The liblinear solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty.

Read more in the User Guide.

Parameters

penalty

:str, ‘l1’ or ‘l2’Used to specify the norm used in the penalization. The newton-cg and lbfgs solvers support only l2 penalties.

dual

:boolDual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.

C

:float, optional (default=1.0)Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.

fit_intercept

:bool, default: TrueSpecifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

intercept_scaling

:float, default: 1

Useful only if solver is liblinear. When self.fit_intercept is True, instance vector x becomes [x, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight. Note that the synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.

class_weight

:dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.

New in version 0.17: class_weight=’balanced’ instead of deprecated class_weight=’auto’.

max_iter

:intUseful only for the newton-cg, sag and lbfgs solvers. Maximum number of iterations taken for the solvers to converge.

random_state

:int seed, RandomState instance, or None (default)The seed of the pseudo random number generator to use when shuffling the data.

solver

:{‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’}

Algorithm to use in the optimization problem.

For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ is faster for large ones.
For multiclass problems, only ‘newton-cg’ and ‘lbfgs’ handle multinomial loss; ‘sag’ and ‘liblinear’ are limited to one-versus-rest schemes.
‘newton-cg’, ‘lbfgs’ and ‘sag’ only handle L2 penalty.

Note that ‘sag’ fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

New in version 0.17: Stochastic Average Gradient descent solver.

tol

:float, optionalTolerance for stopping criteria.

multi_class

:str, {‘ovr’, ‘multinomial’}

Multiclass option can be either ‘ovr’ or ‘multinomial’. If the option chosen is ‘ovr’, then a binary problem is fit for each label. Else the loss minimised is the multinomial loss fit across the entire probability distribution. Works only for the ‘lbfgs’ and ‘newton-cg’ solvers.

verbose

:int

For the liblinear and lbfgs solvers set verbose to any positive number for verbosity.

warm_start

:bool, optional

When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. Useless for liblinear solver.

New in version 0.17: warm_start to support lbfgs, newton-cg, sag solvers.

n_jobs

:int, optional

Number of CPU cores used during the cross-validation loop. If given a value of -1, all cores are used.

Attributes

coef_: Coefficient of the features in the decision function.
intercept_: Intercept (a.k.a. bias) added to the decision function. If fit_intercept is set to False, the intercept is set to zero.
n_iter_: Actual number of iterations for all classes. If binary or multinomial, it returns only 1 element. For the liblinear solver, only the maximum number of iterations across all classes is given.

See also

SGDClassifier: incrementally trained logistic regression (when given the parameter loss="log").

sklearn.svm.LinearSVC : learns SVM models using the same algorithm.

Notes

The underlying C implementation uses a random number generator to select features when fitting the model. It is thus not uncommon to get slightly different results for the same input data. If that happens, try a smaller tol parameter.

Predict output may not match that of standalone liblinear in certain cases. See differences from liblinear in the narrative documentation.

References

LIBLINEAR – A Library for Large Linear Classification: http://www.csie.ntu.edu.tw/~cjlin/liblinear/
Hsiang-Fu Yu, Fang-Lan Huang, Chih-Jen Lin (2011). Dual coordinate descent: methods for logistic regression and maximum entropy models. Machine Learning 85(1-2):41-75. http://www.csie.ntu.edu.tw/~cjlin/papers/maxent_dual.pdf

POSSIBLE NODE NAMES:
	LogisticRegressionClassifierSklearn LogisticRegressionClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MaxAbsScalerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MaxAbsScalerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Scale each feature by its maximum absolute value.

This node has been automatically generated by wrapping the sklearn.preprocessing.data.MaxAbsScaler class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This estimator scales and translates each feature individually such that the maximal absolute value of each feature in the training set will be 1.0. It does not shift/center the data, and thus does not destroy any sparsity.

This scaler can also be applied to sparse CSR or CSC matrices.

New in version 0.17.

Parameters

copy: Set to False to perform inplace scaling and avoid a copy (if the input is already a numpy array).

Attributes

scale_: Per feature relative scaling of the data.

New in version 0.17: scale_ attribute.
max_abs_: Per feature maximum absolute value.
n_samples_seen_: The number of samples processed by the estimator. Will be reset on new calls to fit, but increments across partial_fit calls.
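
As a minimal sketch of the scaling described above (plain NumPy, not the wrapped class itself), each column is divided by its largest absolute value. The guard against all-zero columns is an assumption added here to avoid a division by zero.

```python
import numpy as np

X_train = np.array([[1.0, -2.0, 0.0],
                    [2.0, 0.0, 0.0],
                    [0.0, 1.0, -4.0]])

# Per-feature maximum absolute value (corresponds to max_abs_).
max_abs = np.abs(X_train).max(axis=0)

# Assumed guard: leave all-zero features untouched.
scale = np.where(max_abs == 0.0, 1.0, max_abs)

# No centering is applied, so zero entries stay zero.
X_scaled = X_train / scale
```

Because nothing is subtracted from the data, the zero pattern of X_train is preserved, which is what makes this scaler usable on sparse input.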

POSSIBLE NODE NAMES:
	MaxAbsScalerTransformerSklearnNode MaxAbsScalerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MinMaxScalerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MinMaxScalerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Transforms features by scaling each feature to a given range.

This node has been automatically generated by wrapping the sklearn.preprocessing.data.MinMaxScaler class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This estimator scales and translates each feature individually such that it is in the given range on the training set, i.e. between zero and one.

The transformation is given by:

X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min

where min, max = feature_range.

This transformation is often used as an alternative to zero mean, unit variance scaling.
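
The two formula lines above translate directly into NumPy. This is a sketch only: min_max_scale is a hypothetical helper, and the handling of constant features (range set to 1) is an assumption made here to avoid dividing by zero.

```python
import numpy as np


def min_max_scale(X, feature_range=(0.0, 1.0)):
    """Scale each column of X linearly into feature_range."""
    X = np.asarray(X, dtype=float)
    lo, hi = feature_range
    data_min = X.min(axis=0)
    data_range = X.max(axis=0) - data_min
    data_range[data_range == 0.0] = 1.0  # assumed guard for constant features
    X_std = (X - data_min) / data_range
    return X_std * (hi - lo) + lo


X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])
X_scaled = min_max_scale(X, feature_range=(-1.0, 1.0))
```

Each column now spans exactly the requested range on this data; unseen data transformed with the same minimum and range may fall outside it.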

Read more in the User Guide.

Parameters

feature_range: tuple (min, max), default=(0, 1): Desired range of transformed data.
copy: Set to False to perform inplace row normalization and avoid a copy (if the input is already a numpy array).

Attributes

min_: Per feature adjustment for minimum.
scale_: Per feature relative scaling of the data.

New in version 0.17: scale_ attribute.
data_min_: Per feature minimum seen in the data

New in version 0.17: data_min_ instead of deprecated data_min.
data_max_: Per feature maximum seen in the data

New in version 0.17: data_max_ instead of deprecated data_max.
data_range_: Per feature range (data_max_ - data_min_) seen in the data

New in version 0.17: data_range_ instead of deprecated data_range.

POSSIBLE NODE NAMES:
	MinMaxScalerTransformerSklearnNode MinMaxScalerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MiniBatchDictionaryLearningTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MiniBatchDictionaryLearningTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Mini-batch dictionary learning

This node has been automatically generated by wrapping the sklearn.decomposition.dict_learning.MiniBatchDictionaryLearning class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Finds a dictionary (a set of atoms) that can best be used to represent data using a sparse code.

Solves the optimization problem:

(U^*,V^*) = argmin 0.5 || Y - U V ||_2^2 + alpha * || U ||_1
             (U,V)
             with || V_k ||_2 = 1 for all  0 <= k < n_components
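
To make the objective above concrete, the following sketch evaluates it for a given code U and dictionary V. It assumes that the squared norm of the residual matrix is read as the sum of its squared entries; dictionary_objective is a hypothetical helper, not part of the wrapped class.

```python
import numpy as np


def dictionary_objective(Y, U, V, alpha):
    """0.5 * ||Y - U V||^2 + alpha * ||U||_1 for code U and dictionary V."""
    residual = Y - U.dot(V)
    return 0.5 * np.sum(residual ** 2) + alpha * np.abs(U).sum()


# Dictionary with unit-norm atoms (rows) and a sparse code.
V = np.array([[1.0, 0.0], [0.0, 1.0]])
U = np.array([[2.0, 0.0], [0.0, -3.0]])
Y = U.dot(V)  # data that the dictionary reconstructs exactly

value = dictionary_objective(Y, U, V, alpha=0.1)  # only the L1 term remains: 0.5
```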

Read more in the User Guide.

Parameters

n_components: number of dictionary elements to extract
alpha: sparsity controlling parameter
n_iter: total number of iterations to perform
fit_algorithm: ‘lars’ uses the least angle regression method to solve the lasso problem (linear_model.lars_path); ‘cd’ uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso). Lars will be faster if the estimated components are sparse.
transform_algorithm: Algorithm used to transform the data. ‘lars’ uses the least angle regression method (linear_model.lars_path); ‘lasso_lars’ uses Lars to compute the Lasso solution; ‘lasso_cd’ uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso), where lasso_lars will be faster if the estimated components are sparse; ‘omp’ uses orthogonal matching pursuit to estimate the sparse solution; ‘threshold’ squashes to zero all coefficients less than alpha from the projection dictionary * X’.
transform_n_nonzero_coefs: Number of nonzero coefficients to target in each column of the solution. This is only used by algorithm=’lars’ and algorithm=’omp’ and is overridden by alpha in the omp case.
transform_alpha: If algorithm=’lasso_lars’ or algorithm=’lasso_cd’, alpha is the penalty applied to the L1 norm. If algorithm=’threshold’, alpha is the absolute value of the threshold below which coefficients will be squashed to zero. If algorithm=’omp’, alpha is the tolerance parameter: the value of the reconstruction error targeted. In this case, it overrides n_nonzero_coefs.
split_sign: Whether to split the sparse feature vector into the concatenation of its negative part and its positive part. This can improve the performance of downstream classifiers.
n_jobs: number of parallel jobs to run
dict_init: initial value of the dictionary for warm restart scenarios
verbose: degree of verbosity of the printed output
batch_size: number of samples in each mini-batch
shuffle: whether to shuffle the samples before forming batches
random_state: Pseudo-random number generator state used for random sampling.
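
The split_sign option in the list above can be illustrated with a short sketch. The helper name split_sign_code and the ordering (positive part first, then the magnitude of the negative part) are assumptions made for this example; the parameter description does not fix the order.

```python
import numpy as np


def split_sign_code(code):
    """Concatenate the positive part and the magnitude of the negative part."""
    code = np.asarray(code, dtype=float)
    positive = np.maximum(code, 0.0)
    negative = -np.minimum(code, 0.0)
    return np.hstack([positive, negative])


code = np.array([[1.5, -2.0, 0.0]])
split = split_sign_code(code)  # twice as many columns, all non-negative
```

The original code can be recovered as the difference of the two halves, so no information is lost while downstream classifiers only see non-negative features.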

Attributes

components_: components extracted from the data
inner_stats_: Internal sufficient statistics that are kept by the algorithm. Keeping them is useful in online settings, to avoid losing the history of the evolution, but they shouldn’t have any use for the end user. A (n_components, n_components) is the dictionary covariance matrix. B (n_features, n_components) is the data approximation matrix.
n_iter_: Number of iterations run.

Notes

References:

J. Mairal, F. Bach, J. Ponce, G. Sapiro, 2009: Online dictionary learning for sparse coding (http://www.di.ens.fr/sierra/pdfs/icml09.pdf)

See also

SparseCoder DictionaryLearning SparsePCA MiniBatchSparsePCA

POSSIBLE NODE NAMES:
	MiniBatchDictionaryLearningTransformerSklearn MiniBatchDictionaryLearningTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MiniBatchSparsePCATransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MiniBatchSparsePCATransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Mini-batch Sparse Principal Components Analysis

This node has been automatically generated by wrapping the sklearn.decomposition.sparse_pca.MiniBatchSparsePCA class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Finds the set of sparse components that can optimally reconstruct the data. The amount of sparseness is controllable by the coefficient of the L1 penalty, given by the parameter alpha.

Read more in the User Guide.

Parameters

n_components: number of sparse atoms to extract
alpha: Sparsity controlling parameter. Higher values lead to sparser components.
ridge_alpha: Amount of ridge shrinkage to apply in order to improve conditioning when calling the transform method.
n_iter: number of iterations to perform for each mini batch
callback: callable that gets invoked every five iterations
batch_size: the number of features to take in each mini batch
verbose: degree of output the procedure will print
shuffle: whether to shuffle the data before splitting it in batches
n_jobs: number of parallel jobs to run, or -1 to autodetect.
method: ‘lars’ uses the least angle regression method to solve the lasso problem (linear_model.lars_path); ‘cd’ uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso). Lars will be faster if the estimated components are sparse.
random_state: Pseudo-random number generator state used for random sampling.

Attributes

components_: Sparse components extracted from the data.
error_: Vector of errors at each iteration.
n_iter_: Number of iterations run.

See also

PCA SparsePCA DictionaryLearning

POSSIBLE NODE NAMES:
	MiniBatchSparsePCATransformerSklearnNode MiniBatchSparsePCATransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MultiLabelBinarizerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MultiLabelBinarizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Transform between iterable of iterables and a multilabel format

This node has been automatically generated by wrapping the sklearn.preprocessing.label.MultiLabelBinarizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Although a list of sets or tuples is a very intuitive format for multilabel data, it is unwieldy to process. This transformer converts between this intuitive format and the supported multilabel format: a (samples x classes) binary matrix indicating the presence of a class label.

Parameters

classes: Indicates an ordering for the class labels
sparse_output: Set to true if output binary array is desired in CSR sparse format

Attributes

classes_: A copy of the classes parameter where provided, or otherwise, the sorted set of classes found when fitting.

Examples

>>> from sklearn.preprocessing import MultiLabelBinarizer
>>> mlb = MultiLabelBinarizer()
>>> mlb.fit_transform([(1, 2), (3,)])
array([[1, 1, 0],
       [0, 0, 1]])
>>> mlb.classes_
array([1, 2, 3])

>>> mlb.fit_transform([set(['sci-fi', 'thriller']), set(['comedy'])])
array([[0, 1, 1],
       [1, 0, 0]])
>>> list(mlb.classes_)
['comedy', 'sci-fi', 'thriller']

POSSIBLE NODE NAMES:
	MultiLabelBinarizerTransformerSklearnNode MultiLabelBinarizerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MultiTaskElasticNetCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MultiTaskElasticNetCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Multi-task L1/L2 ElasticNet with built-in cross-validation.

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.MultiTaskElasticNetCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The optimization objective for MultiTaskElasticNet is:

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2
+ alpha * l1_ratio * ||W||_21
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2

Where:

||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of norm of each row.
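
The objective and the mixed norm above can be evaluated directly. The sketch below is an illustration only; l21_norm and multitask_enet_objective are hypothetical helper names, and W is taken with one row per feature and one column per task, as in the formula.

```python
import numpy as np


def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row."""
    return np.sqrt((W ** 2).sum(axis=1)).sum()


def multitask_enet_objective(X, Y, W, alpha, l1_ratio):
    n_samples = X.shape[0]
    residual = Y - X.dot(W)
    data_fit = np.sum(residual ** 2) / (2.0 * n_samples)
    mixed = alpha * l1_ratio * l21_norm(W)
    ridge = 0.5 * alpha * (1.0 - l1_ratio) * np.sum(W ** 2)
    return data_fit + mixed + ridge


W = np.array([[3.0, 4.0], [0.0, 0.0]])  # second feature unused by both tasks
X = np.eye(2)
Y = X.dot(W)                            # perfect fit, only penalties remain

value = multitask_enet_objective(X, Y, W, alpha=1.0, l1_ratio=0.5)
# mixed term: 0.5 * 5 = 2.5, ridge term: 0.25 * 25 = 6.25, total 8.75
```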

Read more in the User Guide.

Parameters

eps

:float, optional

Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

alphas

:array-like, optional

List of alphas where to compute the models. If not provided, set automatically.

n_alphas

:int, optional

Number of alphas along the regularization path.
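
How eps and n_alphas interact can be sketched as follows. The helper alpha_grid and the starting value alpha_max are hypothetical, and the logarithmic spacing of the grid is an assumption of this example; only the ratio alpha_min / alpha_max = eps is taken from the description above.

```python
import numpy as np


def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Decreasing, log-spaced grid from alpha_max down to alpha_max * eps."""
    return np.logspace(np.log10(alpha_max),
                       np.log10(alpha_max * eps),
                       num=n_alphas)


alphas = alpha_grid(alpha_max=2.0, eps=1e-3, n_alphas=5)
```

A smaller eps stretches the path toward weaker regularization, while n_alphas only controls how finely that path is sampled.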

l1_ratio

:float or array of floats

The ElasticNet mixing parameter, with 0 < l1_ratio <= 1. For l1_ratio = 1 the penalty is a pure L1/L2 (mixed-norm) penalty. As l1_ratio approaches 0, it approaches a pure L2 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1/L2 and L2. This parameter can be a list, in which case the different values are tested by cross-validation and the one giving the best prediction score is used. Note that a good choice of list of values for l1_ratio is often to put more values close to 1 (i.e. Lasso) and fewer close to 0 (i.e. Ridge), as in [.1, .5, .7, .9, .95, .99, 1].

fit_intercept

:boolean

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

normalize

:boolean, optional, default False

If True, the regressors X will be normalized before regression.

copy_X

:boolean, optional, default True

If True, X will be copied; else, it may be overwritten.

max_iter

:int, optional

The maximum number of iterations.

tol

:float, optional

The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.
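
The last option in the list above, an iterable yielding train/test splits, can be as simple as a generator of index arrays. The sketch below builds contiguous, unshuffled folds in the spirit of K-fold splitting; k_fold_indices is a hypothetical helper and not the KFold class itself.

```python
import numpy as np


def k_fold_indices(n_samples, n_folds=3):
    """Yield (train_indices, test_indices) pairs for contiguous folds."""
    indices = np.arange(n_samples)
    for test_idx in np.array_split(indices, n_folds):
        train_idx = np.setdiff1d(indices, test_idx)
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=7, n_folds=3))
```

Every sample appears in exactly one test fold, and the fold sizes differ by at most one when n_samples is not divisible by n_folds.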

verbose

:bool or integer

Amount of verbosity.

n_jobs

:integer, optional

Number of CPUs to use during the cross validation. If -1, use all the CPUs. Note that this is used only if multiple values for l1_ratio are given.

selection

:str, default ‘cyclic’

If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.

random_state

:int, RandomState instance, or None (default)

The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.

Attributes

intercept_: Independent term in decision function.
coef_: Parameter vector (W in the cost function formula).
alpha_: The amount of penalization chosen by cross validation
mse_path_: mean square error for the test set on each fold, varying alpha
alphas_: The grid of alphas used for fitting, for each l1_ratio
l1_ratio_: best l1_ratio obtained by cross-validation.
n_iter_: number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.MultiTaskElasticNetCV()
>>> clf.fit([[0,0], [1, 1], [2, 2]],
...         [[0, 0], [1, 1], [2, 2]])
... 
MultiTaskElasticNetCV(alphas=None, copy_X=True, cv=None, eps=0.001,
       fit_intercept=True, l1_ratio=0.5, max_iter=1000, n_alphas=100,
       n_jobs=1, normalize=False, random_state=None, selection='cyclic',
       tol=0.0001, verbose=0)
>>> print(clf.coef_)
[[ 0.52875032  0.46958558]
 [ 0.52875032  0.46958558]]
>>> print(clf.intercept_)
[ 0.00166409  0.00166409]

See also

MultiTaskElasticNet ElasticNetCV MultiTaskLassoCV

Notes

The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

POSSIBLE NODE NAMES:
	MultiTaskElasticNetCVRegressorSklearn MultiTaskElasticNetCVRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MultiTaskElasticNetRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MultiTaskElasticNetRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Multi-task ElasticNet model trained with L1/L2 mixed-norm as regularizer

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.MultiTaskElasticNet class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The optimization objective for MultiTaskElasticNet is:

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2
+ alpha * l1_ratio * ||W||_21
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2

Where:

||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of norm of each row.

Read more in the User Guide.

Parameters

alpha: Constant that multiplies the L1/L2 term. Defaults to 1.0
l1_ratio: The ElasticNet mixing parameter, with 0 < l1_ratio <= 1. For l1_ratio = 1 the penalty is a pure L1/L2 (mixed-norm) penalty. As l1_ratio approaches 0, it approaches a pure L2 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1/L2 and L2.
fit_intercept: whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
max_iter: The maximum number of iterations
tol: The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
warm_start: When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.
selection: If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.
random_state: The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.

Attributes

intercept_: Independent term in decision function.
coef_: Parameter vector (W in the cost function formula). If a 1D y is passed in at fit (non multi-task usage), coef_ is then a 1D array
n_iter_: number of iterations run by the coordinate descent solver to reach the specified tolerance.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.MultiTaskElasticNet(alpha=0.1)
>>> clf.fit([[0,0], [1, 1], [2, 2]], [[0, 0], [1, 1], [2, 2]])
... 
MultiTaskElasticNet(alpha=0.1, copy_X=True, fit_intercept=True,
        l1_ratio=0.5, max_iter=1000, normalize=False, random_state=None,
        selection='cyclic', tol=0.0001, warm_start=False)
>>> print(clf.coef_)
[[ 0.45663524  0.45612256]
 [ 0.45663524  0.45612256]]
>>> print(clf.intercept_)
[ 0.0872422  0.0872422]

See also

ElasticNet, MultiTaskLasso

Notes

The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

POSSIBLE NODE NAMES:
	MultiTaskElasticNetRegressorSklearn MultiTaskElasticNetRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MultiTaskLassoCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MultiTaskLassoCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Multi-task L1/L2 Lasso with built-in cross-validation.

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.MultiTaskLassoCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The optimization objective for MultiTaskLasso is:

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2 + alpha * ||W||_21

Where:

||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of norm of each row.

Read more in the User Guide.

Parameters

eps

:float, optional

Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

alphas

:array-like, optional

List of alphas where to compute the models. If not provided, set automatically.

n_alphas

:int, optional

Number of alphas along the regularization path.

fit_intercept

:boolean

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

normalize

:boolean, optional, default False

If True, the regressors X will be normalized before regression.

copy_X

:boolean, optional, default True

If True, X will be copied; else, it may be overwritten.

max_iter

:int, optional

The maximum number of iterations.

tol

:float, optional

The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

verbose

:bool or integer

Amount of verbosity.

n_jobs

:integer, optional

Number of CPUs to use during the cross validation. If -1, use all the CPUs. Note that this is used only if multiple values for l1_ratio are given.

selection

:str, default ‘cyclic’

If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.

random_state

:int, RandomState instance, or None (default)

The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.

Attributes

intercept_: Independent term in decision function.
coef_: Parameter vector (W in the cost function formula).
alpha_: The amount of penalization chosen by cross validation
mse_path_: mean square error for the test set on each fold, varying alpha
alphas_: The grid of alphas used for fitting.
n_iter_: number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.

See also

MultiTaskElasticNet ElasticNetCV MultiTaskElasticNetCV

Notes

The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

POSSIBLE NODE NAMES:
	MultiTaskLassoCVRegressorSklearn MultiTaskLassoCVRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MultiTaskLassoRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MultiTaskLassoRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Multi-task Lasso model trained with L1/L2 mixed-norm as regularizer

This node has been automatically generated by wrapping the sklearn.linear_model.coordinate_descent.MultiTaskLasso class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The optimization objective for MultiTaskLasso is:

(1 / (2 * n_samples)) * ||Y - XW||^2_Fro + alpha * ||W||_21

Where:

||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of norm of each row.
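
Why a penalty on row norms selects the same features for all tasks can be illustrated with row-wise soft-thresholding, the proximal operator of the ||W||_21 norm. This is an assumption-laden illustration, not the coordinate descent solver used by the wrapped class; row_soft_threshold and the threshold value are hypothetical.

```python
import numpy as np


def row_soft_threshold(W, threshold):
    """Shrink each row of W toward zero by `threshold` in Euclidean norm."""
    norms = np.sqrt((W ** 2).sum(axis=1, keepdims=True))
    safe_norms = np.where(norms > 0.0, norms, 1.0)  # avoid division by zero
    shrink = np.maximum(0.0, 1.0 - threshold / safe_norms)
    return W * shrink


# One row per feature, one column per task.
W = np.array([[3.0, 4.0],    # row norm 5.0: kept, but shrunk
              [0.3, 0.4]])   # row norm 0.5: removed for every task
W_shrunk = row_soft_threshold(W, threshold=1.0)
```

A row is either kept for all tasks or set to zero for all tasks, which is the joint sparsity pattern that the L1/L2 mixed norm encourages.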

Read more in the User Guide.

Parameters

alpha: Constant that multiplies the L1/L2 term. Defaults to 1.0
fit_intercept: whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
normalize: If True, the regressors X will be normalized before regression.
copy_X: If True, X will be copied; else, it may be overwritten.
max_iter: The maximum number of iterations
tol: The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
warm_start: When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.
selection: If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to ‘random’) often leads to significantly faster convergence especially when tol is higher than 1e-4.
random_state: The seed of the pseudo random number generator that selects a random feature to update. Useful only when selection is set to ‘random’.

Attributes

coef_: parameter vector (W in the cost function formula)
intercept_: independent term in decision function.
n_iter_: number of iterations run by the coordinate descent solver to reach the specified tolerance.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.MultiTaskLasso(alpha=0.1)
>>> clf.fit([[0,0], [1, 1], [2, 2]], [[0, 0], [1, 1], [2, 2]])
MultiTaskLasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,
        normalize=False, random_state=None, selection='cyclic', tol=0.0001,
        warm_start=False)
>>> print(clf.coef_)
[[ 0.89393398  0.        ]
 [ 0.89393398  0.        ]]
>>> print(clf.intercept_)
[ 0.10606602  0.10606602]

See also

Lasso, MultiTaskElasticNet

Notes

The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

POSSIBLE NODE NAMES:
	MultiTaskLassoRegressorSklearnNode MultiTaskLassoRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.MultinomialNBClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.MultinomialNBClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Naive Bayes classifier for multinomial models

This node has been automatically generated by wrapping the sklearn.naive_bayes.MultinomialNB class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification). The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.

Read more in the User Guide.

Parameters

alpha: Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).
fit_prior: Whether to learn class prior probabilities or not. If false, a uniform prior will be used.
class_prior: Prior probabilities of the classes. If specified the priors are not adjusted according to the data.

Attributes

class_log_prior_: Smoothed empirical log probability for each class.
intercept_: Mirrors class_log_prior_ for interpreting MultinomialNB as a linear model.
feature_log_prob_: Empirical log probability of features given a class, P(x_i|y).
coef_: Mirrors feature_log_prob_ for interpreting MultinomialNB as a linear model.
class_count_: Number of samples encountered for each class during fitting. This value is weighted by the sample weight when provided.
feature_count_: Number of samples encountered for each (class, feature) during fitting. This value is weighted by the sample weight when provided.
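
The role of alpha can be seen by computing smoothed feature probabilities from per-class feature counts. This is a sketch of additive smoothing under the usual definition (add alpha to every count, then normalize per class); smoothed_feature_log_prob is a hypothetical helper, and the counts are made-up example data.

```python
import numpy as np


def smoothed_feature_log_prob(feature_count, alpha=1.0):
    """Log of (count + alpha) / (class total + alpha * n_features)."""
    feature_count = np.asarray(feature_count, dtype=float)
    smoothed = feature_count + alpha
    totals = smoothed.sum(axis=1, keepdims=True)
    return np.log(smoothed) - np.log(totals)


# Rows are classes, columns are features (e.g. word counts).
feature_count = np.array([[3.0, 0.0, 1.0],
                          [0.0, 2.0, 2.0]])
log_prob = smoothed_feature_log_prob(feature_count, alpha=1.0)
```

With alpha > 0, features never seen in a class still get a small non-zero probability, so the log probabilities stay finite.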

Examples

>>> import numpy as np
>>> X = np.random.randint(5, size=(6, 100))
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> from sklearn.naive_bayes import MultinomialNB
>>> clf = MultinomialNB()
>>> clf.fit(X, y)
MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]

Notes

For the rationale behind the names coef_ and intercept_, i.e. naive Bayes as a linear classifier, see J. Rennie et al. (2003), Tackling the poor assumptions of naive Bayes text classifiers, ICML.

References

C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to Information Retrieval. Cambridge University Press, pp. 234-265. http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html

POSSIBLE NODE NAMES:
	MultinomialNBClassifierSklearn MultinomialNBClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.NMFTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.NMFTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Non-Negative Matrix Factorization (NMF)

This node has been automatically generated by wrapping the sklearn.decomposition.nmf.NMF class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Find two non-negative matrices (W, H) whose product approximates the non-negative matrix X. This factorization can be used for example for dimensionality reduction, source separation or topic extraction.

The objective function is:

0.5 * ||X - WH||_Fro^2
+ alpha * l1_ratio * ||vec(W)||_1
+ alpha * l1_ratio * ||vec(H)||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2
+ 0.5 * alpha * (1 - l1_ratio) * ||H||_Fro^2

Where:

||A||_Fro^2 = \sum_{i,j} A_{ij}^2 (Frobenius norm)
||vec(A)||_1 = \sum_{i,j} abs(A_{ij}) (Elementwise L1 norm)

The objective function is minimized with an alternating minimization of W and H.
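
The objective above can be evaluated term by term for given factors. The sketch below only computes the value of the objective, it does not perform the alternating minimization; nmf_objective and its helpers are hypothetical names.

```python
import numpy as np


def squared_frobenius(A):
    return np.sum(A ** 2)


def elementwise_l1(A):
    return np.abs(A).sum()


def nmf_objective(X, W, H, alpha=0.0, l1_ratio=0.0):
    fit = 0.5 * squared_frobenius(X - W.dot(H))
    l1_penalty = alpha * l1_ratio * (elementwise_l1(W) + elementwise_l1(H))
    l2_penalty = 0.5 * alpha * (1.0 - l1_ratio) * (
        squared_frobenius(W) + squared_frobenius(H))
    return fit + l1_penalty + l2_penalty


W = np.array([[1.0], [2.0]])
H = np.array([[1.0, 3.0]])
X = W.dot(H)  # exact rank-1 factorization, so the fit term is zero

value = nmf_objective(X, W, H, alpha=2.0, l1_ratio=0.5)
# L1 part: 2 * 0.5 * (3 + 4) = 7.0, L2 part: 0.5 * 2 * 0.5 * (5 + 10) = 7.5
```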

Read more in the User Guide.

Parameters

n_components

:int or None

Number of components. If n_components is not set, all features are kept.

init

:‘random’ | ‘nndsvd’ | ‘nndsvda’ | ‘nndsvdar’ | ‘custom’

Method used to initialize the procedure. Default: ‘nndsvdar’ if n_components < n_features, otherwise random. Valid options:

‘random’: non-negative random matrices, scaled with sqrt(X.mean() / n_components)
‘nndsvd’: Nonnegative Double Singular Value Decomposition (NNDSVD) initialization (better for sparseness)
‘nndsvda’: NNDSVD with zeros filled with the average of X (better when sparsity is not desired)
‘nndsvdar’: NNDSVD with zeros filled with small random values (generally faster, less accurate alternative to NNDSVDa for when sparsity is not desired)
‘custom’: use custom matrices W and H

solver

:‘pg’ | ‘cd’

Numerical solver to use:

‘pg’ is a Projected Gradient solver (deprecated).
‘cd’ is a Coordinate Descent solver (recommended).

New in version 0.17: Coordinate Descent solver.

Changed in version 0.17: Deprecated Projected Gradient solver.

tol

:double, default: 1e-4

Tolerance value used in stopping conditions.

max_iter

:integer, default: 200

Number of iterations to compute.

random_state

:integer seed, RandomState instance, or None (default)

Random number generator seed control.

alpha

:double, default: 0.

Constant that multiplies the regularization terms. Set it to zero to have no regularization.

New in version 0.17: alpha used in the Coordinate Descent solver.

l1_ratio

:double, default: 0.

The regularization mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an elementwise L2 penalty (aka Frobenius Norm). For l1_ratio = 1 it is an elementwise L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.

New in version 0.17: Regularization parameter l1_ratio used in the Coordinate Descent solver.

shuffle

:boolean, default: False

If true, randomize the order of coordinates in the CD solver.

New in version 0.17: shuffle parameter used in the Coordinate Descent solver.

nls_max_iter

:integer, default: 2000

Number of iterations in NLS subproblem. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

sparseness

:‘data’ | ‘components’ | None, default: None

Where to enforce sparsity in the model. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

beta

:double, default: 1

Degree of sparseness, if sparseness is not None. Larger values mean more sparseness. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

eta

:double, default: 0.1

Degree of correctness to maintain, if sparseness is not None. Smaller values mean larger error. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

Attributes

components_: Non-negative components of the data.
reconstruction_err_: Frobenius norm of the matrix difference between the training data and the reconstructed data from the fit produced by the model. || X - WH ||_2
n_iter_: Actual number of iterations.

Examples

>>> import numpy as np
>>> X = np.array([[1,1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])
>>> from sklearn.decomposition import NMF
>>> model = NMF(n_components=2, init='random', random_state=0)
>>> model.fit(X) 
NMF(alpha=0.0, beta=1, eta=0.1, init='random', l1_ratio=0.0, max_iter=200,
  n_components=2, nls_max_iter=2000, random_state=0, shuffle=False,
  solver='cd', sparseness=None, tol=0.0001, verbose=0)

>>> model.components_
array([[ 2.09783018,  0.30560234],
       [ 2.13443044,  2.13171694]])
>>> model.reconstruction_err_ 
0.00115993...

References

C.-J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation, 19(2007), 2756-2779. http://www.csie.ntu.edu.tw/~cjlin/nmf/

Cichocki, Andrzej, and P. H. A. N. Anh-Huy. “Fast local algorithms for large scale nonnegative matrix and tensor factorizations.” IEICE transactions on fundamentals of electronics, communications and computer sciences 92.3: 708-721, 2009.

POSSIBLE NODE NAMES:
	NMFTransformerSklearn NMFTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.NearestCentroidClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.NearestCentroidClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Nearest centroid classifier.

This node has been automatically generated by wrapping the sklearn.neighbors.nearest_centroid.NearestCentroid class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Each class is represented by its centroid, with test samples classified to the class with the nearest centroid.
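
The rule above can be sketched in a few lines of plain Python. This is a minimal illustration that assumes the Euclidean metric (mean centroids) and no centroid shrinking; the helper names `fit_centroids` and `predict` are invented for the example and are not part of sklearn or pySPACE.

```python
def fit_centroids(X, y):
    """Return {label: mean vector} computed from the samples of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, sample):
    """Pick the label whose centroid has the smallest squared Euclidean distance."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], sample))
    return min(centroids, key=sq_dist)


X = [[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]]
y = [1, 1, 1, 2, 2, 2]
centroids = fit_centroids(X, y)
print(predict(centroids, [-0.8, -1]))
```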

Read more in the User Guide.

Parameters

metric: string, or callable: The metric to use when calculating distance between instances in a feature array. If metric is a string or callable, it must be one of the options allowed by metrics.pairwise.pairwise_distances for its metric parameter. The centroid for the samples of each class is the point that minimizes the sum of the distances (according to the metric) to all samples belonging to that class. If the “manhattan” metric is provided, this centroid is the median; for all other metrics, the centroid is the mean.
shrink_threshold: Threshold for shrinking centroids to remove features.

Attributes

centroids_: Centroid of each class

Examples

>>> from sklearn.neighbors.nearest_centroid import NearestCentroid
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = NearestCentroid()
>>> clf.fit(X, y)
NearestCentroid(metric='euclidean', shrink_threshold=None)
>>> print(clf.predict([[-0.8, -1]]))
[1]

See also

sklearn.neighbors.KNeighborsClassifier: nearest neighbors classifier

Notes

When used for text classification with tf-idf vectors, this classifier is also known as the Rocchio classifier.

References

Tibshirani, R., Hastie, T., Narasimhan, B., & Chu, G. (2002). Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proceedings of the National Academy of Sciences of the United States of America, 99(10), 6567-6572. The National Academy of Sciences.

POSSIBLE NODE NAMES:
	NearestCentroidClassifierSklearnNode NearestCentroidClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.NormalizerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.NormalizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Normalize samples individually to unit norm.

This node has been automatically generated by wrapping the sklearn.preprocessing.data.Normalizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Each sample (i.e. each row of the data matrix) with at least one non zero component is rescaled independently of other samples so that its norm (l1 or l2) equals one.

This transformer is able to work both with dense numpy arrays and scipy.sparse matrix (use CSR format if you want to avoid the burden of a copy / conversion).

Scaling inputs to unit norms is a common operation in text classification and clustering. For instance, the dot product of two l2-normalized TF-IDF vectors is the cosine similarity of the vectors, which is the base similarity metric for the Vector Space Model commonly used by the Information Retrieval community.
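
The relation between row normalization and cosine similarity can be checked with a small pure-Python sketch. The function `normalize_rows` is a hypothetical stand-in written for this example, covering only the l1 and l2 norms; it leaves all-zero rows unchanged, in line with the description above.

```python
import math


def normalize_rows(X, norm="l2"):
    """Rescale each row independently so that its l1 or l2 norm equals one."""
    out = []
    for row in X:
        if norm == "l1":
            scale = sum(abs(v) for v in row)
        elif norm == "l2":
            scale = math.sqrt(sum(v * v for v in row))
        else:
            raise ValueError("unsupported norm: %r" % norm)
        # Rows without any non-zero component are left untouched.
        out.append([v / scale for v in row] if scale else list(row))
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


a, b = [3.0, 4.0], [4.0, 3.0]
na, nb = normalize_rows([a, b])
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
print(dot(na, nb), cosine)  # both values agree up to rounding
```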

Read more in the User Guide.

Parameters

norm: The norm to use to normalize each non zero sample.
copy: set to False to perform inplace row normalization and avoid a copy (if the input is already a numpy array or a scipy.sparse CSR matrix).

Notes

This estimator is stateless (besides constructor parameters), the fit method does nothing but is useful when used in a pipeline.

See also

sklearn.preprocessing.normalize() equivalent function without the object oriented API

POSSIBLE NODE NAMES:
	NormalizerTransformerSklearnNode NormalizerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.NuSVCClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.NuSVCClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Nu-Support Vector Classification.

This node has been automatically generated by wrapping the sklearn.svm.classes.NuSVC class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Similar to SVC but uses a parameter to control the number of support vectors.

The implementation is based on libsvm.

Read more in the User Guide.

Parameters

nu: An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Should be in the interval (0, 1].
kernel: Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. If a callable is given it is used to precompute the kernel matrix.
degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.
coef0: Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
probability: Whether to enable probability estimates. This must be enabled prior to calling fit, and will slow down that method.
shrinking: Whether to use the shrinking heuristic.
tol: Tolerance for stopping criterion.
cache_size: Specify the size of the kernel cache (in MB).
class_weight: Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The ‘auto’ mode uses the values of y to automatically adjust weights inversely proportional to class frequencies.
verbose: Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
max_iter: Hard limit on iterations within solver, or -1 for no limit.
decision_function_shape: Whether to return a one-vs-rest (‘ovr’) decision function of shape (n_samples, n_classes) as all other classifiers, or the original one-vs-one (‘ovo’) decision function of libsvm which has shape (n_samples, n_classes * (n_classes - 1) / 2). The default of None will currently behave as ‘ovo’ for backward compatibility and raise a deprecation warning, but will change to ‘ovr’ in 0.18.

New in version 0.17: decision_function_shape=’ovr’ is recommended.

Changed in version 0.17: Deprecated decision_function_shape=’ovo’ and None.
random_state: The seed of the pseudo random number generator to use when shuffling the data for probability estimation.
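
The two decision_function_shape layouts differ only in their number of columns. The sketch below merely restates the shape arithmetic from the parameter description; `decision_function_columns` is a hypothetical helper, not an sklearn function, and it does not cover any special handling of the binary case.

```python
from itertools import combinations


def decision_function_columns(class_labels, shape="ovr"):
    """Number of decision-function columns for the given layout."""
    labels = sorted(set(class_labels))
    if shape == "ovr":
        # One column per class.
        return len(labels)
    if shape == "ovo":
        # One column per unordered class pair: n * (n - 1) / 2.
        return len(list(combinations(labels, 2)))
    raise ValueError("shape must be 'ovr' or 'ovo'")


y = ["a", "b", "c", "d", "a", "c"]
print(decision_function_columns(y, "ovr"))
print(decision_function_columns(y, "ovo"))
```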

Attributes

support_

:array-like, shape = [n_SV]

Indices of support vectors.

support_vectors_

:array-like, shape = [n_SV, n_features]

Support vectors.

n_support_

:array-like, dtype=int32, shape = [n_class]

Number of support vectors for each class.

dual_coef_

:array, shape = [n_class-1, n_SV]

Coefficients of the support vector in the decision function. For multiclass, coefficient for all 1-vs-1 classifiers. The layout of the coefficients in the multiclass case is somewhat non-trivial. See the section about multi-class classification in the SVM section of the User Guide for details.

coef_

:array, shape = [n_class-1, n_features]

Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.

coef_ is a read-only property derived from dual_coef_ and support_vectors_.

intercept_

:array, shape = [n_class * (n_class-1) / 2]

Constants in decision function.

Examples

>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>> y = np.array([1, 1, 2, 2])
>>> from sklearn.svm import NuSVC
>>> clf = NuSVC()
>>> clf.fit(X, y) 
NuSVC(cache_size=200, class_weight=None, coef0=0.0,
      decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
      max_iter=-1, nu=0.5, probability=False, random_state=None,
      shrinking=True, tol=0.001, verbose=False)
>>> print(clf.predict([[-0.8, -1]]))
[1]

See also

SVC: Support Vector Machine for classification using libsvm.
LinearSVC: Scalable linear Support Vector Machine for classification using liblinear.

POSSIBLE NODE NAMES:
	NuSVCClassifierSklearnNode NuSVCClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.NuSVRRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.NuSVRRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Nu Support Vector Regression.

This node has been automatically generated by wrapping the sklearn.svm.classes.NuSVR class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Similar to NuSVC, for regression, uses a parameter nu to control the number of support vectors. However, unlike NuSVC, where nu replaces C, here nu replaces the parameter epsilon of epsilon-SVR.

The implementation is based on libsvm.

Read more in the User Guide.

Parameters

C: Penalty parameter C of the error term.
nu: An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Should be in the interval (0, 1]. By default 0.5 will be taken.
kernel: Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. If a callable is given it is used to precompute the kernel matrix.
degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.
coef0: Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
shrinking: Whether to use the shrinking heuristic.
tol: Tolerance for stopping criterion.
cache_size: Specify the size of the kernel cache (in MB).
verbose: Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
max_iter: Hard limit on iterations within solver, or -1 for no limit.
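
To make the gamma='auto' rule concrete, here is a small sketch of an RBF kernel evaluation in plain Python. It assumes the usual form exp(-gamma * ||x - y||^2); the function `rbf_kernel` below is written for this example and is not the libsvm implementation.

```python
import math


def rbf_kernel(x, y, gamma="auto"):
    """RBF kernel value for two equal-length vectors."""
    if gamma == "auto":
        # 'auto' is replaced by 1 / n_features.
        gamma = 1.0 / len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


print(rbf_kernel([1.0, 0.0], [0.0, 0.0]))             # gamma = 1/2
print(rbf_kernel([1.0, 0.0], [0.0, 0.0], gamma=2.0))  # explicit gamma
```

Because 'auto' depends on the number of features, the same pair of points yields a different kernel value once additional feature columns are added.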

Attributes

support_

:array-like, shape = [n_SV]

Indices of support vectors.

support_vectors_

:array-like, shape = [n_SV, n_features]

Support vectors.

dual_coef_

:array, shape = [1, n_SV]

Coefficients of the support vector in the decision function.

coef_

:array, shape = [1, n_features]

Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.

coef_ is a read-only property derived from dual_coef_ and support_vectors_.

intercept_

:array, shape = [1]

Constants in decision function.

Examples

>>> from sklearn.svm import NuSVR
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> np.random.seed(0)
>>> y = np.random.randn(n_samples)
>>> X = np.random.randn(n_samples, n_features)
>>> clf = NuSVR(C=1.0, nu=0.1)
>>> clf.fit(X, y)  
NuSVR(C=1.0, cache_size=200, coef0=0.0, degree=3, gamma='auto',
      kernel='rbf', max_iter=-1, nu=0.1, shrinking=True, tol=0.001,
      verbose=False)

See also

NuSVC: Support Vector Machine for classification implemented with libsvm with a parameter to control the number of support vectors.
SVR: epsilon Support Vector Machine for regression implemented with libsvm.

POSSIBLE NODE NAMES:
	NuSVRRegressorSklearn NuSVRRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.NystroemTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.NystroemTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Approximate a kernel map using a subset of the training data.

This node has been automatically generated by wrapping the sklearn.kernel_approximation.Nystroem class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Constructs an approximate feature map for an arbitrary kernel using a subset of the data as basis.

Read more in the User Guide.

Parameters

kernel: Kernel map to be approximated. A callable should accept two arguments and the keyword arguments passed to this object as kernel_params, and should return a floating point number.
n_components: Number of features to construct. How many data points will be used to construct the mapping.
gamma: Gamma parameter for the RBF, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. Ignored by other kernels.
degree: Degree of the polynomial kernel. Ignored by other kernels.
coef0: Zero coefficient for polynomial and sigmoid kernels. Ignored by other kernels.
kernel_params: Additional parameters (keyword arguments) for kernel function passed as callable object.
random_state: If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator.

Attributes

components_: Subset of training points used to construct the feature map.
component_indices_: Indices of components_ in the training set.
normalization_: Normalization matrix needed for embedding. Square root of the kernel matrix on components_.

References

Williams, C.K.I. and Seeger, M. “Using the Nystroem method to speed up kernel machines”, Advances in neural information processing systems 2001
T. Yang, Y. Li, M. Mahdavi, R. Jin and Z. Zhou “Nystroem Method vs Random Fourier Features: A Theoretical and Empirical Comparison”, Advances in Neural Information Processing Systems 2012

See also

RBFSampler: An approximation to the RBF kernel using random Fourier features.

sklearn.metrics.pairwise.kernel_metrics : List of built-in kernels.

POSSIBLE NODE NAMES:
	NystroemTransformerSklearn NystroemTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.OneHotEncoderTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.OneHotEncoderTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Encode categorical integer features using a one-hot aka one-of-K scheme.

This node has been automatically generated by wrapping the sklearn.preprocessing.data.OneHotEncoder class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The input to this transformer should be a matrix of integers, denoting the values taken on by categorical (discrete) features. The output will be a sparse matrix where each column corresponds to one possible value of one feature. It is assumed that input features take on values in the range [0, n_values).

This encoding is needed for feeding categorical data to many scikit-learn estimators, notably linear models and SVMs with the standard kernels.

Read more in the User Guide.

Parameters

n_values

:‘auto’, int or array of ints

Number of values per feature.

‘auto’ : determine value range from training data.
int : maximum value for all features.
array : maximum value per feature.

categorical_features: “all” or array of indices or mask

Specify what features are treated as categorical.

‘all’ (default): All features are treated as categorical.
array of indices: Array of categorical feature indices.
mask: Array of length n_features and with dtype=bool.

Non-categorical features are always stacked to the right of the matrix.

dtype

:number type, default=np.float

Desired dtype of output.

sparse

:boolean, default=True

Will return a sparse matrix if set to True, otherwise an array.

handle_unknown

:str, ‘error’ or ‘ignore’

Whether to raise an error or ignore if an unknown categorical feature is present during transform.

Attributes

active_features_: Indices for active features, meaning values that actually occur in the training set. Only available when n_values is 'auto'.
feature_indices_: Indices to feature ranges. Feature i in the original data is mapped to features from feature_indices_[i] to feature_indices_[i+1] (and then potentially masked by active_features_ afterwards)
n_values_: Maximum number of values per feature.

Examples

Given a dataset with three features and four samples, we let the encoder find the maximum value per feature and transform the data to a binary one-hot encoding.

>>> from sklearn.preprocessing import OneHotEncoder
>>> enc = OneHotEncoder()
>>> enc.fit([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]])  
OneHotEncoder(categorical_features='all', dtype=<... 'float'>,
       handle_unknown='error', n_values='auto', sparse=True)
>>> enc.n_values_
array([2, 3, 4])
>>> enc.feature_indices_
array([0, 2, 5, 9])
>>> enc.transform([[0, 1, 1]]).toarray()
array([[ 1.,  0.,  0.,  1.,  0.,  0.,  1.,  0.,  0.]])
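
The column layout in the example above can be reproduced in plain Python. The sketch below mirrors the n_values='auto' behaviour as described in the text (value range taken from the training data) and ignores the masking by active_features_; `fit_one_hot` and `transform_one_hot` are hypothetical helpers, not part of sklearn.

```python
def fit_one_hot(X):
    """Return (n_values, feature_indices) learned from rows of integers."""
    n_features = len(X[0])
    # Values are assumed to lie in [0, n_values) for each feature.
    n_values = [max(row[j] for row in X) + 1 for j in range(n_features)]
    feature_indices = [0]
    for n in n_values:
        feature_indices.append(feature_indices[-1] + n)
    return n_values, feature_indices


def transform_one_hot(row, feature_indices):
    """Encode one sample; feature j occupies columns
    feature_indices[j] .. feature_indices[j + 1] - 1."""
    encoded = [0.0] * feature_indices[-1]
    for j, value in enumerate(row):
        encoded[feature_indices[j] + value] = 1.0
    return encoded


n_values, feature_indices = fit_one_hot(
    [[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]])
print(n_values, feature_indices)
print(transform_one_hot([0, 1, 1], feature_indices))
```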

See also

sklearn.feature_extraction.DictVectorizer: performs a one-hot encoding of dictionary items (also handles string-valued features).
sklearn.feature_extraction.FeatureHasher: performs an approximate one-hot encoding of dictionary items or strings.

POSSIBLE NODE NAMES:
	OneHotEncoderTransformerSklearn OneHotEncoderTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.OneVsOneClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.OneVsOneClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

One-vs-one multiclass strategy

This node has been automatically generated by wrapping the sklearn.multiclass.OneVsOneClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This strategy consists of fitting one classifier per class pair. At prediction time, the class that received the most votes is selected. Since it requires fitting n_classes * (n_classes - 1) / 2 classifiers, this method is usually slower than one-vs-the-rest, due to its O(n_classes^2) complexity. However, this method may be advantageous for algorithms such as kernel algorithms which don’t scale well with n_samples. This is because each individual learning problem only involves a small subset of the data whereas, with one-vs-the-rest, the complete dataset is used n_classes times.
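
The pairwise fitting and voting scheme can be sketched with a deliberately simple base learner. In the example below, each binary "classifier" is just the pair of class means of one-dimensional data, which is an assumption made purely for illustration; the function names are hypothetical and unrelated to the sklearn implementation.

```python
from collections import Counter
from itertools import combinations


def fit_one_vs_one(X, y):
    """Fit one toy binary model per class pair on 1-D data."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        # Each binary problem only sees the samples of its two classes.
        xa = [x for x, t in zip(X, y) if t == a]
        xb = [x for x, t in zip(X, y) if t == b]
        models[(a, b)] = (sum(xa) / len(xa), sum(xb) / len(xb))
    return models


def predict_by_votes(models, x):
    """Let every pairwise model vote and return the most voted class."""
    votes = Counter()
    for (a, b), (mean_a, mean_b) in models.items():
        votes[a if abs(x - mean_a) <= abs(x - mean_b) else b] += 1
    return votes.most_common(1)[0][0]


X = [0.0, 1.0, 4.0, 5.0, 9.0, 10.0]
y = ["a", "a", "b", "b", "c", "c"]
models = fit_one_vs_one(X, y)
print(len(models))                    # n_classes * (n_classes - 1) / 2 models
print(predict_by_votes(models, 4.2))
```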

Read more in the User Guide.

Parameters

estimator: An estimator object implementing fit and one of decision_function or predict_proba.
n_jobs: The number of jobs to use for the computation. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
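
The n_jobs convention can be summarized as a small function. This is only a restatement of the rule given above; the helper name `effective_n_jobs` and the lower bound of one worker for very negative values are assumptions of this sketch.

```python
def effective_n_jobs(n_jobs, n_cpus):
    """Translate an n_jobs setting into a number of workers."""
    if n_jobs < 0:
        # -1 means all CPUs, -2 all but one, and so on.
        # Clamping to at least one worker is an assumption of this sketch.
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


for setting in (1, -1, -2):
    print(setting, effective_n_jobs(setting, n_cpus=8))
```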

Attributes

estimators_: Estimators used for predictions.
classes_: Array containing labels.

POSSIBLE NODE NAMES:
	OneVsOneClassifierSklearn OneVsOneClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.OneVsRestClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.OneVsRestClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

One-vs-the-rest (OvR) multiclass/multilabel strategy

This node has been automatically generated by wrapping the sklearn.multiclass.OneVsRestClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Also known as one-vs-all, this strategy consists of fitting one classifier per class. For each classifier, the class is fitted against all the other classes. In addition to its computational efficiency (only n_classes classifiers are needed), one advantage of this approach is its interpretability. Since each class is represented by one and only one classifier, it is possible to gain knowledge about the class by inspecting its corresponding classifier. This is the most commonly used strategy for multiclass classification and is a fair default choice.

This strategy can also be used for multilabel learning, where a classifier is used to predict multiple labels for instance, by fitting on a 2-d matrix in which cell [i, j] is 1 if sample i has label j and 0 otherwise.
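
The 2-d label matrix mentioned above can be built as follows. This is a minimal sketch with invented label names; `label_indicator_matrix` is a hypothetical helper and not the label binarizer used by sklearn.

```python
def label_indicator_matrix(sample_labels):
    """Return (sorted label names, 0/1 matrix) where matrix[i][j] == 1
    if sample i has label j and 0 otherwise."""
    names = sorted({name for tags in sample_labels for name in tags})
    column = {name: j for j, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in sample_labels]
    for i, tags in enumerate(sample_labels):
        for name in tags:
            matrix[i][column[name]] = 1
    return names, matrix


samples = [["news", "sport"], ["sport"], []]
names, Y = label_indicator_matrix(samples)
# Column j is the binary target the j-th one-vs-rest classifier is fitted on.
targets = {name: [row[j] for row in Y] for j, name in enumerate(names)}
print(names)
print(Y)
print(targets)
```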

In the multilabel learning literature, OvR is also known as the binary relevance method.

Read more in the User Guide.

Parameters

estimator: An estimator object implementing fit and one of decision_function or predict_proba.
n_jobs: The number of jobs to use for the computation. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.

Attributes

estimators_: Estimators used for predictions.
classes_: Class labels.
label_binarizer_: Object used to transform multiclass labels to binary labels and vice-versa.
multilabel_: Whether a OneVsRestClassifier is a multilabel classifier.

POSSIBLE NODE NAMES:
	OneVsRestClassifierSklearn OneVsRestClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.OrthogonalMatchingPursuitCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.OrthogonalMatchingPursuitCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Cross-validated Orthogonal Matching Pursuit model (OMP)

This node has been automatically generated by wrapping the sklearn.linear_model.omp.OrthogonalMatchingPursuitCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Parameters

copy

:bool, optional

Whether the design matrix X must be copied by the algorithm. A false value is only helpful if X is already Fortran-ordered, otherwise a copy is made anyway.

fit_intercept

:boolean, optional

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

normalize

:boolean, optional

If False, the regressors X are assumed to be already normalized.

max_iter

:integer, optional

Maximum number of iterations to perform, and therefore the maximum number of features to include. Defaults to 10% of n_features, but at least 5 if available.

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

n_jobs

:integer, optional

Number of CPUs to use during the cross validation. If -1, use all the CPUs.

verbose

:boolean or integer, optional

Sets the verbosity amount.
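
The default for max_iter ("10% of n_features but at least 5 if available") can be read as the following arithmetic. The helper name `default_max_iter` is hypothetical, and rounding 10% down to an integer is an assumption of this sketch rather than something stated in the text.

```python
def default_max_iter(n_features):
    """10% of n_features, at least 5, but never more than n_features."""
    ten_percent = n_features // 10  # rounding down is an assumption
    return min(max(ten_percent, 5), n_features)


for n in (3, 20, 100):
    print(n, default_max_iter(n))
```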

Read more in the User Guide.

Attributes

intercept_: Independent term in decision function.
coef_: Parameter vector (w in the problem formulation).
n_nonzero_coefs_: Estimated number of non-zero coefficients giving the best mean squared error over the cross-validation folds.
n_iter_: Number of active features across every target for the model refit with the best hyperparameters got by cross-validating across all folds.

See also

orthogonal_mp orthogonal_mp_gram lars_path Lars LassoLars OrthogonalMatchingPursuit LarsCV LassoLarsCV decomposition.sparse_encode

POSSIBLE NODE NAMES:
	OrthogonalMatchingPursuitCVRegressorSklearnNode OrthogonalMatchingPursuitCVRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.OrthogonalMatchingPursuitRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.OrthogonalMatchingPursuitRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Orthogonal Matching Pursuit model (OMP)

This node has been automatically generated by wrapping the sklearn.linear_model.omp.OrthogonalMatchingPursuit class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Parameters

n_nonzero_coefs: Desired number of non-zero entries in the solution. If None (by default) this value is set to 10% of n_features.
tol: Maximum norm of the residual. If not None, overrides n_nonzero_coefs.
fit_intercept: whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
normalize: If False, the regressors X are assumed to be already normalized.
precompute: Whether to use a precomputed Gram and Xy matrix to speed up calculations. Improves performance when n_targets or n_samples is very large. Note that if you already have such matrices, you can pass them directly to the fit method.

Read more in the User Guide.

Attributes

coef_: Parameter vector (w in the formula).
intercept_: Independent term in decision function.
n_iter_: Number of active features across every target.

Notes

Orthogonal matching pursuit was introduced in S. Mallat, Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, Vol. 41, No. 12. (December 1993), pp. 3397-3415. (http://blanche.polytechnique.fr/~mallat/papiers/MallatPursuit93.pdf)

This implementation is based on Rubinstein, R., Zibulevsky, M. and Elad, M., Efficient Implementation of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit Technical Report - CS Technion, April 2008. http://www.cs.technion.ac.il/~ronrubin/Publications/KSVD-OMP-v2.pdf

See also

orthogonal_mp orthogonal_mp_gram lars_path Lars LassoLars decomposition.sparse_encode

POSSIBLE NODE NAMES:
	OrthogonalMatchingPursuitRegressorSklearnNode OrthogonalMatchingPursuitRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.OutputCodeClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.OutputCodeClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

(Error-Correcting) Output-Code multiclass strategy

This node has been automatically generated by wrapping the sklearn.multiclass.OutputCodeClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Output-code based strategies consist in representing each class with a binary code (an array of 0s and 1s). At fitting time, one binary classifier per bit in the code book is fitted. At prediction time, the classifiers are used to project new points in the class space and the class closest to the points is chosen. The main advantage of these strategies is that the number of classifiers used can be controlled by the user, either for compressing the model (0 < code_size < 1) or for making the model more robust to errors (code_size > 1). See the documentation for more details.
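
A minimal sketch of the code-book idea follows. It assumes that the number of bits is int(n_classes * code_size) and that decoding picks the class whose code has the smallest squared Euclidean distance to the predicted bits; both function names and the fixed code book are invented for illustration. Note that a purely random code book may assign identical codes to different classes.

```python
import random


def make_code_book(n_classes, code_size, seed=0):
    """Draw one random 0/1 code per class; one binary classifier per bit."""
    rng = random.Random(seed)
    n_bits = int(n_classes * code_size)  # assumed rounding rule
    return [[rng.randint(0, 1) for _ in range(n_bits)]
            for _ in range(n_classes)]


def decode(code_book, bit_scores):
    """Return the index of the class whose code is closest to the scores."""
    def sq_dist(code):
        return sum((c - s) ** 2 for c, s in zip(code, bit_scores))
    return min(range(len(code_book)), key=lambda k: sq_dist(code_book[k]))


# code_size < 1 compresses the model, code_size > 1 adds redundancy.
print(len(make_code_book(10, 0.5)[0]), len(make_code_book(4, 1.5)[0]))

# Hypothetical fixed code book for three classes and three bit classifiers.
code_book = [[0, 0, 1], [1, 0, 0], [1, 1, 1]]
print(decode(code_book, [0.9, 0.8, 0.7]))
```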

Read more in the User Guide.

Parameters

estimator: An estimator object implementing fit and one of decision_function or predict_proba.
code_size: Percentage of the number of classes to be used to create the code book. A number between 0 and 1 will require fewer classifiers than one-vs-the-rest. A number greater than 1 will require more classifiers than one-vs-the-rest.
random_state: The generator used to initialize the codebook. Defaults to numpy.random.
n_jobs: The number of jobs to use for the computation. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.

Attributes

estimators_: Estimators used for predictions.
classes_: Array containing labels.
code_book_: Binary array containing the code of each class.

References

[1]	“Solving multiclass learning problems via error-correcting output codes”, Dietterich T., Bakiri G., Journal of Artificial Intelligence Research 2, 1995.

[2]	“The error coding method and PICTs”, James G., Hastie T., Journal of Computational and Graphical statistics 7, 1998.

[3]	“The Elements of Statistical Learning”, Hastie T., Tibshirani R., Friedman J., page 606 (second-edition) 2008.

POSSIBLE NODE NAMES:
	OutputCodeClassifierSklearnNode OutputCodeClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.PCATransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.PCATransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Principal component analysis (PCA)

This node has been automatically generated by wrapping the sklearn.decomposition.pca.PCA class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Linear dimensionality reduction using Singular Value Decomposition of the data and keeping only the most significant singular vectors to project the data to a lower dimensional space.

This implementation uses the scipy.linalg implementation of the singular value decomposition. It only works for dense arrays and is not scalable to large dimensional data.

The time complexity of this implementation is O(n ** 3) assuming n ~ n_samples ~ n_features.

Read more in the User Guide.

Parameters

n_components

:int, None or string

Number of components to keep. If n_components is not set, all components are kept:

n_components == min(n_samples, n_features)

If n_components == ‘mle’, Minka’s MLE is used to guess the dimension.

If 0 < n_components < 1, the number of components is selected such that the amount of variance that needs to be explained is greater than the percentage specified by n_components.

copy

:bool

If False, data passed to fit are overwritten and running fit(X).transform(X) will not yield the expected results; use fit_transform(X) instead.

whiten

:bool, optional

When True (False by default) the components_ vectors are divided by n_samples times singular values to ensure uncorrelated outputs with unit component-wise variances.

Whitening will remove some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of the downstream estimators by making their data respect some hard-wired assumptions.
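For the fractional setting of n_components described above (0 < n_components < 1), the selection rule can be sketched as follows. The function components_for_variance is a hypothetical helper, not the wrapped implementation; it assumes the ratios are sorted in decreasing order and keeps the smallest number of leading components whose cumulative ratio reaches the threshold.

```python
def components_for_variance(explained_variance_ratio, threshold):
    """Smallest number of leading components whose cumulative explained
    variance ratio reaches the threshold (illustrative sketch)."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_variance_ratio, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k
    # Threshold not reachable: keep every component.
    return len(explained_variance_ratio)


# Ratios taken from the PCA example output in this section.
ratios = [0.99244, 0.00755]
n_for_95_percent = components_for_variance(ratios, 0.95)
n_for_999_permille = components_for_variance(ratios, 0.999)
```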

Attributes

components_: Principal axes in feature space, representing the directions of maximum variance in the data.
explained_variance_ratio_: Percentage of variance explained by each of the selected components. If n_components is not set then all components are stored and the sum of explained variances is equal to 1.0
mean_: Per-feature empirical mean, estimated from the training set.
n_components_: The estimated number of components. Relevant when n_components is set to ‘mle’ or a number between 0 and 1 to select using explained variance.
noise_variance_: The estimated noise covariance following the Probabilistic PCA model from Tipping and Bishop 1999. See “Pattern Recognition and Machine Learning” by C. Bishop, 12.2.1 p. 574 or http://www.miketipping.com/papers/met-mppca.pdf. It is required to compute the estimated data covariance and score samples.

Notes

For n_components=’mle’, this class uses the method of Thomas P. Minka: Automatic Choice of Dimensionality for PCA. NIPS 2000: 598-604.

Implements the probabilistic PCA model from:

M. Tipping and C. Bishop, Probabilistic Principal Component Analysis, Journal of the Royal Statistical Society, Series B, 61, Part 3, pp. 611-622 via the score and score_samples methods. See http://www.miketipping.com/papers/met-mppca.pdf

Due to implementation subtleties of the Singular Value Decomposition (SVD), which is used in this implementation, running fit twice on the same matrix can lead to principal components with signs flipped (change in direction). For this reason, it is important to always use the same estimator object to transform data in a consistent fashion.

Examples

>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> pca = PCA(n_components=2)
>>> pca.fit(X)
PCA(copy=True, n_components=2, whiten=False)
>>> print(pca.explained_variance_ratio_) 
[ 0.99244...  0.00755...]

See also

RandomizedPCA KernelPCA SparsePCA TruncatedSVD

POSSIBLE NODE NAMES:
	PCATransformerSklearn PCATransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.PassiveAggressiveClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.PassiveAggressiveClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Passive Aggressive Classifier

This node has been automatically generated by wrapping the sklearn.linear_model.passive_aggressive.PassiveAggressiveClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

C

:float

Maximum step size (regularization). Defaults to 1.0.

fit_intercept

:bool, default=False

Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

n_iter

:int, optional

The number of passes over the training data (aka epochs). Defaults to 5.

shuffle

:bool, default=True

Whether or not the training data should be shuffled after each epoch.

random_state

:int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use when shuffling the data.

verbose

:integer, optional

The verbosity level

n_jobs

:integer, optional

The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. -1 means ‘all CPUs’. Defaults to 1.

loss

:string, optional

The loss function to be used:

hinge: equivalent to PA-I in the reference paper.
squared_hinge: equivalent to PA-II in the reference paper.

warm_start

:bool, optional

When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.

class_weight

:dict, {class_label: weight} or “balanced” or None, optional

Preset for the class_weight fit parameter.

Weights associated with classes. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

New in version 0.17: parameter class_weight to automatically weight samples.
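To make the roles of C and loss concrete, the following is a minimal sketch of a single Passive-Aggressive update for a binary label in {-1, +1}, following the PA-I and PA-II rules of the reference paper cited in this section. The function pa_update is hypothetical, omits the intercept and class weights, and is not the wrapped sklearn implementation.

```python
def pa_update(w, x, y, C=1.0, loss="hinge"):
    """One Passive-Aggressive step for a label y in {-1, +1} (sketch)."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    hinge = max(0.0, 1.0 - margin)
    sq_norm = sum(xi * xi for xi in x)
    if hinge == 0.0 or sq_norm == 0.0:
        return list(w)  # passive: the margin is already satisfied
    if loss == "hinge":            # PA-I: step size capped by C
        tau = min(C, hinge / sq_norm)
    elif loss == "squared_hinge":  # PA-II: step size damped by 1 / (2C)
        tau = hinge / (sq_norm + 1.0 / (2.0 * C))
    else:
        raise ValueError("unknown loss: %r" % (loss,))
    # Aggressive: move w just far enough towards the correct side.
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


w_pa1 = pa_update([0.0, 0.0], [1.0, 0.0], y=1, C=0.5, loss="hinge")
w_pa2 = pa_update([0.0, 0.0], [1.0, 0.0], y=1, C=1.0, loss="squared_hinge")
```

A small C limits how far a single (possibly noisy) sample can move the weights.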

Attributes

coef_: Weights assigned to the features.
intercept_: Constants in decision function.

See also

SGDClassifier Perceptron

References

Online Passive-Aggressive Algorithms <http://jmlr.csail.mit.edu/papers/volume7/crammer06a/crammer06a.pdf> K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, Y. Singer - JMLR (2006)

POSSIBLE NODE NAMES:
	PassiveAggressiveClassifierSklearn PassiveAggressiveClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.PassiveAggressiveRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.PassiveAggressiveRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Passive Aggressive Regressor

This node has been automatically generated by wrapping the sklearn.linear_model.passive_aggressive.PassiveAggressiveRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

C

:float

Maximum step size (regularization). Defaults to 1.0.

epsilon

:float

If the difference between the current prediction and the correct label is below this threshold, the model is not updated.

fit_intercept

:bool

Whether the intercept should be estimated or not. If False, the data is assumed to be already centered. Defaults to True.

n_iter

:int, optional

The number of passes over the training data (aka epochs). Defaults to 5.

shuffle

:bool, default=True

Whether or not the training data should be shuffled after each epoch.

random_state

:int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use when shuffling the data.

verbose

:integer, optional

The verbosity level

loss

:string, optional

The loss function to be used:

epsilon_insensitive: equivalent to PA-I in the reference paper.
squared_epsilon_insensitive: equivalent to PA-II in the reference paper.

warm_start

:bool, optional

When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.
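The interplay of epsilon, C and loss can be sketched with a single regression update based on the epsilon-insensitive loss from the reference paper. The function pa_regression_update is a hypothetical illustration without intercept, not the wrapped sklearn code.

```python
def pa_regression_update(w, x, y, C=1.0, epsilon=0.1,
                         loss="epsilon_insensitive"):
    """One Passive-Aggressive regression step (illustrative sketch)."""
    prediction = sum(wi * xi for wi, xi in zip(w, x))
    error = y - prediction
    # No loss as long as the prediction lies inside the epsilon tube.
    loss_value = max(0.0, abs(error) - epsilon)
    sq_norm = sum(xi * xi for xi in x)
    if loss_value == 0.0 or sq_norm == 0.0:
        return list(w)
    if loss == "epsilon_insensitive":            # PA-I
        tau = min(C, loss_value / sq_norm)
    elif loss == "squared_epsilon_insensitive":  # PA-II
        tau = loss_value / (sq_norm + 1.0 / (2.0 * C))
    else:
        raise ValueError("unknown loss: %r" % (loss,))
    sign = 1.0 if error > 0 else -1.0
    return [wi + sign * tau * xi for wi, xi in zip(w, x)]


# Inside the tube: the model is left untouched.
w_unchanged = pa_regression_update([0.0], [1.0], y=0.05, epsilon=0.1)
# Outside the tube: the weight moves until the error equals epsilon.
w_moved = pa_regression_update([0.0], [1.0], y=2.0, C=10.0, epsilon=0.5)
```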

Attributes

coef_: Weights assigned to the features.
intercept_: Constants in decision function.

See also

SGDRegressor

References

Online Passive-Aggressive Algorithms <http://jmlr.csail.mit.edu/papers/volume7/crammer06a/crammer06a.pdf> K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, Y. Singer - JMLR (2006)

POSSIBLE NODE NAMES:
	PassiveAggressiveRegressorSklearnNode PassiveAggressiveRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.PatchExtractorTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.PatchExtractorTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Extracts patches from a collection of images

This node has been automatically generated by wrapping the sklearn.feature_extraction.image.PatchExtractor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

patch_size: The dimensions of one patch.
max_patches: The maximum number of patches per image to extract. If max_patches is a float in (0, 1), it is taken to mean a proportion of the total number of patches.
random_state: Pseudo-random number generator state used for random sampling.
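How patch_size and max_patches interact can be sketched by counting patches for a single 2D image. The helper count_patches is hypothetical; it assumes patches are extracted at every position with a stride of one pixel.

```python
def count_patches(image_height, image_width, patch_height, patch_width,
                  max_patches=None):
    """Number of patches extracted from one image (illustrative sketch)."""
    all_patches = ((image_height - patch_height + 1)
                   * (image_width - patch_width + 1))
    if max_patches is None:
        return all_patches
    if isinstance(max_patches, int):
        # An absolute limit, never more than what the image offers.
        return min(max_patches, all_patches)
    if isinstance(max_patches, float) and 0.0 < max_patches < 1.0:
        # A proportion of the total number of patches, rounded down.
        return int(max_patches * all_patches)
    raise ValueError("invalid max_patches: %r" % (max_patches,))


total = count_patches(4, 4, 2, 2)                    # all 2x2 patches
half = count_patches(4, 4, 2, 2, max_patches=0.5)    # proportion
limited = count_patches(4, 4, 2, 2, max_patches=5)   # absolute limit
```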

POSSIBLE NODE NAMES:
	PatchExtractorTransformerSklearn PatchExtractorTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.PerceptronClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.PerceptronClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Perceptron

This node has been automatically generated by wrapping the sklearn.linear_model.perceptron.Perceptron class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

penalty

:None, ‘l2’ or ‘l1’ or ‘elasticnet’

The penalty (aka regularization term) to be used. Defaults to None.

alpha

:float

Constant that multiplies the regularization term if regularization is used. Defaults to 0.0001

fit_intercept

:bool

Whether the intercept should be estimated or not. If False, the data is assumed to be already centered. Defaults to True.

n_iter

:int, optional

The number of passes over the training data (aka epochs). Defaults to 5.

shuffle

:bool, optional, default True

Whether or not the training data should be shuffled after each epoch.

random_state

:int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use when shuffling the data.

verbose

:integer, optional

The verbosity level

n_jobs

:integer, optional

The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. -1 means ‘all CPUs’. Defaults to 1.

eta0

:double

Constant by which the updates are multiplied. Defaults to 1.

class_weight

:dict, {class_label: weight} or “balanced” or None, optional

Preset for the class_weight fit parameter.

Weights associated with classes. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

warm_start

:bool, optional

When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.
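The “balanced” formula given for class_weight above, n_samples / (n_classes * np.bincount(y)), can be reproduced without NumPy. The helper balanced_class_weights is a hypothetical name used only for this sketch.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weights inversely proportional to class frequencies (sketch)."""
    counts = Counter(y)
    n_samples = float(len(y))
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Three samples of class 0 and one of class 1: the rare class weighs more.
weights = balanced_class_weights([0, 0, 0, 1])
```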

Attributes

coef_: Weights assigned to the features.
intercept_: Constants in decision function.

Notes

Perceptron and SGDClassifier share the same underlying implementation. In fact, Perceptron() is equivalent to SGDClassifier(loss="perceptron", eta0=1, learning_rate="constant", penalty=None).

See also

SGDClassifier

References

http://en.wikipedia.org/wiki/Perceptron and references therein.

POSSIBLE NODE NAMES:
	PerceptronClassifierSklearn PerceptronClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.PolynomialFeaturesTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.PolynomialFeaturesTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Generate polynomial and interaction features.

This node has been automatically generated by wrapping the sklearn.preprocessing.data.PolynomialFeatures class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].

Parameters

degree: The degree of the polynomial features. Default = 2.
interaction_only: If true, only interaction features are produced: features that are products of at most degree distinct input features (so not x[1] ** 2, x[0] * x[2] ** 3, etc.).
include_bias: If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model).

Examples

>>> import numpy as np
>>> from sklearn.preprocessing import PolynomialFeatures
>>> X = np.arange(6).reshape(3, 2)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> poly = PolynomialFeatures(2)
>>> poly.fit_transform(X)
array([[  1.,   0.,   1.,   0.,   0.,   1.],
       [  1.,   2.,   3.,   4.,   6.,   9.],
       [  1.,   4.,   5.,  16.,  20.,  25.]])
>>> poly = PolynomialFeatures(interaction_only=True)
>>> poly.fit_transform(X)
array([[  1.,   0.,   1.,   0.],
       [  1.,   2.,   3.,   6.],
       [  1.,   4.,   5.,  20.]])

Attributes

powers_: powers_[i, j] is the exponent of the jth input in the ith output.
n_input_features_: The total number of input features.
n_output_features_: The total number of polynomial output features. The number of output features is computed by iterating over all suitably sized combinations of input features.
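The way the output features are enumerated can be sketched in pure Python with itertools. The function polynomial_features is a hypothetical stand-in that handles one sample at a time; it is meant to mirror the parameters degree, interaction_only and include_bias, not to replace the wrapped class.

```python
from itertools import combinations, combinations_with_replacement


def polynomial_features(row, degree=2, interaction_only=False,
                        include_bias=True):
    """Polynomial feature expansion of a single sample (sketch)."""
    # Without replacement, every input feature appears at most once
    # per product, which yields pure interaction terms.
    combine = combinations if interaction_only else combinations_with_replacement
    start = 0 if include_bias else 1
    features = []
    for d in range(start, degree + 1):
        for indices in combine(range(len(row)), d):
            value = 1.0
            for i in indices:
                value *= row[i]
            features.append(value)
    return features


full = polynomial_features([2, 3])                                # [1, a, b, a^2, ab, b^2]
interactions = polynomial_features([2, 3], interaction_only=True)  # [1, a, b, ab]
```

The two results correspond to the second row of each output array in the Examples above.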

Notes

Be aware that the number of features in the output array scales polynomially in the number of features of the input array, and exponentially in the degree. High degrees can cause overfitting.

See examples/linear_model/plot_polynomial_interpolation.py

POSSIBLE NODE NAMES:
	PolynomialFeaturesTransformerSklearnNode PolynomialFeaturesTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.ProjectedGradientNMFTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.ProjectedGradientNMFTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Non-Negative Matrix Factorization (NMF)

This node has been automatically generated by wrapping the sklearn.decomposition.nmf.ProjectedGradientNMF class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Find two non-negative matrices (W, H) whose product approximates the non-negative matrix X. This factorization can be used for example for dimensionality reduction, source separation or topic extraction.

The objective function is:

0.5 * ||X - WH||_Fro^2
+ alpha * l1_ratio * ||vec(W)||_1
+ alpha * l1_ratio * ||vec(H)||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2
+ 0.5 * alpha * (1 - l1_ratio) * ||H||_Fro^2

Where:

||A||_Fro^2 = \sum_{i,j} A_{ij}^2 (Frobenius norm)
||vec(A)||_1 = \sum_{i,j} abs(A_{ij}) (Elementwise L1 norm)

The objective function is minimized with an alternating minimization of W and H.
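The objective above can be evaluated directly for small matrices. The following sketch uses nested lists instead of arrays; nmf_objective is a hypothetical helper that only computes the value of the objective and does not perform the alternating minimization.

```python
def nmf_objective(X, W, H, alpha=0.0, l1_ratio=0.0):
    """Value of the regularized NMF objective for list-of-lists matrices."""
    n_rows, n_cols, n_components = len(X), len(X[0]), len(H)
    # Reconstruction WH.
    WH = [[sum(W[i][r] * H[r][j] for r in range(n_components))
           for j in range(n_cols)] for i in range(n_rows)]
    fit = 0.5 * sum((X[i][j] - WH[i][j]) ** 2
                    for i in range(n_rows) for j in range(n_cols))
    # Elementwise L1 norms and squared Frobenius norms of W and H.
    l1 = sum(abs(v) for M in (W, H) for row in M for v in row)
    l2 = sum(v * v for M in (W, H) for row in M for v in row)
    return (fit
            + alpha * l1_ratio * l1
            + 0.5 * alpha * (1.0 - l1_ratio) * l2)


W = [[1.0], [2.0]]
H = [[1.0, 2.0]]
X = [[1.0, 2.0], [2.0, 4.0]]  # exactly W times H
exact_fit = nmf_objective(X, W, H)
pure_l1 = nmf_objective(X, W, H, alpha=1.0, l1_ratio=1.0)
pure_l2 = nmf_objective(X, W, H, alpha=1.0, l1_ratio=0.0)
```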

Read more in the User Guide.

Parameters

n_components

:int or None

Number of components; if n_components is not set, all features are kept.

init

:‘random’ | ‘nndsvd’ | ‘nndsvda’ | ‘nndsvdar’ | ‘custom’

Method used to initialize the procedure. Default: ‘nndsvdar’ if n_components < n_features, otherwise random. Valid options:

‘random’: non-negative random matrices, scaled with sqrt(X.mean() / n_components)
‘nndsvd’: Nonnegative Double Singular Value Decomposition (NNDSVD) initialization (better for sparseness)
‘nndsvda’: NNDSVD with zeros filled with the average of X (better when sparsity is not desired)
‘nndsvdar’: NNDSVD with zeros filled with small random values (generally faster, less accurate alternative to NNDSVDa for when sparsity is not desired)
‘custom’: use custom matrices W and H

solver

:‘pg’ | ‘cd’

Numerical solver to use:

‘pg’ is a Projected Gradient solver (deprecated).
‘cd’ is a Coordinate Descent solver (recommended).

New in version 0.17: Coordinate Descent solver.

Changed in version 0.17: Deprecated Projected Gradient solver.

tol

:double, default: 1e-4

Tolerance value used in stopping conditions.

max_iter

:integer, default: 200

Number of iterations to compute.

random_state

:integer seed, RandomState instance, or None (default)

Random number generator seed control.

alpha

:double, default: 0.

Constant that multiplies the regularization terms. Set it to zero to have no regularization.

New in version 0.17: alpha used in the Coordinate Descent solver.

l1_ratio

:double, default: 0.

The regularization mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an elementwise L2 penalty (aka Frobenius Norm). For l1_ratio = 1 it is an elementwise L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.

New in version 0.17: Regularization parameter l1_ratio used in the Coordinate Descent solver.

shuffle

:boolean, default: False

If true, randomize the order of coordinates in the CD solver.

New in version 0.17: shuffle parameter used in the Coordinate Descent solver.

nls_max_iter

:integer, default: 2000

Number of iterations in NLS subproblem. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

sparseness

:‘data’ | ‘components’ | None, default: None

Where to enforce sparsity in the model. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

beta

:double, default: 1

Degree of sparseness, if sparseness is not None. Larger values mean more sparseness. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

eta

:double, default: 0.1

Degree of correctness to maintain, if sparsity is not None. Smaller values mean larger error. Used only in the deprecated ‘pg’ solver.

Changed in version 0.17: Deprecated Projected Gradient solver. Use Coordinate Descent solver instead.

Attributes

components_: Non-negative components of the data.
reconstruction_err_: Frobenius norm of the matrix difference between the training data and the reconstructed data from the fit produced by the model. || X - WH ||_2
n_iter_: Actual number of iterations.

Examples

>>> import numpy as np
>>> X = np.array([[1,1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])
>>> from sklearn.decomposition import NMF
>>> model = NMF(n_components=2, init='random', random_state=0)
>>> model.fit(X) 
NMF(alpha=0.0, beta=1, eta=0.1, init='random', l1_ratio=0.0, max_iter=200,
  n_components=2, nls_max_iter=2000, random_state=0, shuffle=False,
  solver='cd', sparseness=None, tol=0.0001, verbose=0)

>>> model.components_
array([[ 2.09783018,  0.30560234],
       [ 2.13443044,  2.13171694]])
>>> model.reconstruction_err_ 
0.00115993...

References

C.-J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation, 19(2007), 2756-2779. http://www.csie.ntu.edu.tw/~cjlin/nmf/

Cichocki, Andrzej, and P. H. A. N. Anh-Huy. “Fast local algorithms for large scale nonnegative matrix and tensor factorizations.” IEICE transactions on fundamentals of electronics, communications and computer sciences 92.3: 708-721, 2009.

POSSIBLE NODE NAMES:
	ProjectedGradientNMFTransformerSklearnNode ProjectedGradientNMFTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.QuadraticDiscriminantAnalysisClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.QuadraticDiscriminantAnalysisClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Quadratic Discriminant Analysis

This node has been automatically generated by wrapping the sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A classifier with a quadratic decision boundary, generated by fitting class conditional densities to the data and using Bayes’ rule.

The model fits a Gaussian density to each class.

New in version 0.17: QuadraticDiscriminantAnalysis

Changed in version 0.17: Deprecated qda.QDA has been moved to QuadraticDiscriminantAnalysis.

Parameters

priors: Priors on classes
reg_param: Regularizes the covariance estimate as (1-reg_param)*Sigma + reg_param*np.eye(n_features)
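The reg_param formula shrinks each class covariance towards the identity matrix. A minimal sketch with nested lists follows; regularize_covariance is a hypothetical helper written only to spell out the formula above.

```python
def regularize_covariance(sigma, reg_param):
    """Compute (1 - reg_param) * Sigma + reg_param * I for a square matrix."""
    n = len(sigma)
    return [[(1.0 - reg_param) * sigma[i][j]
             + (reg_param if i == j else 0.0)
             for j in range(n)] for i in range(n)]


sigma = [[2.0, 1.0], [1.0, 2.0]]
half_shrunk = regularize_covariance(sigma, 0.5)
fully_shrunk = regularize_covariance(sigma, 1.0)  # identity matrix
```

With reg_param = 0 the estimate is unchanged; with reg_param = 1 every class is modelled with an identity covariance.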

Attributes

covariances_: Covariance matrices of each class.
means_: Class means.
priors_: Class priors (sum to 1).
rotations_: For each class k an array of shape [n_features, n_k], with n_k = min(n_features, number of elements in class k). It is the rotation of the Gaussian distribution, i.e. its principal axis.
scalings_: For each class k an array of shape [n_k]. It contains the scaling of the Gaussian distributions along its principal axes, i.e. the variance in the rotated coordinate system.
store_covariances: If True the covariance matrices are computed and stored in the self.covariances_ attribute.

New in version 0.17.
tol: Threshold used for rank estimation.

New in version 0.17.

Examples

>>> from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = QuadraticDiscriminantAnalysis()
>>> clf.fit(X, y)
... 
QuadraticDiscriminantAnalysis(priors=None, reg_param=0.0,
                              store_covariances=False, tol=0.0001)
>>> print(clf.predict([[-0.8, -1]]))
[1]

See also

sklearn.discriminant_analysis.LinearDiscriminantAnalysis: Linear Discriminant Analysis

POSSIBLE NODE NAMES:
	QuadraticDiscriminantAnalysisClassifierSklearn QuadraticDiscriminantAnalysisClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RANSACRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RANSACRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

RANSAC (RANdom SAmple Consensus) algorithm.

This node has been automatically generated by wrapping the sklearn.linear_model.ransac.RANSACRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

RANSAC is an iterative algorithm for the robust estimation of parameters from a subset of inliers from the complete data set. More information can be found in the general documentation of linear models.

A detailed description of the algorithm can be found in the documentation of the linear_model sub-package.

Read more in the User Guide.

Parameters

base_estimator

:object, optional

Base estimator object which implements the following methods:

fit(X, y): Fit model to given training data and target values.

score(X, y): Returns the mean accuracy on the given test data, which is used for the stop criterion defined by stop_score. Additionally, the score is used to decide which of two equally large consensus sets is chosen as the better one.

If base_estimator is None, then base_estimator=sklearn.linear_model.LinearRegression() is used for target values of dtype float.

Note that the current implementation only supports regression estimators.

min_samples

:int (>= 1) or float ([0, 1]), optional

Minimum number of samples chosen randomly from original data. Treated as an absolute number of samples for min_samples >= 1, treated as a relative number ceil(min_samples * X.shape[0]) for min_samples < 1. This is typically chosen as the minimal number of samples necessary to estimate the given base_estimator. By default a sklearn.linear_model.LinearRegression() estimator is assumed and min_samples is chosen as X.shape[1] + 1.

residual_threshold

:float, optional

Maximum residual for a data sample to be classified as an inlier. By default the threshold is chosen as the MAD (median absolute deviation) of the target values y.

is_data_valid

:callable, optional

This function is called with the randomly selected data before the model is fitted to it: is_data_valid(X, y). If its return value is False the current randomly chosen sub-sample is skipped.

is_model_valid

:callable, optional

This function is called with the estimated model and the randomly selected data: is_model_valid(model, X, y). If its return value is False the current randomly chosen sub-sample is skipped. Rejecting samples with this function is computationally costlier than with is_data_valid. is_model_valid should therefore only be used if the estimated model is needed for making the rejection decision.

max_trials

:int, optional

Maximum number of iterations for random sample selection.

stop_n_inliers

:int, optional

Stop iteration if at least this number of inliers are found.

stop_score

:float, optional

Stop iteration if score is greater than or equal to this threshold.

stop_probability

:float in range [0, 1], optional

RANSAC iteration stops if at least one outlier-free set of the training data is sampled in RANSAC. This requires generating at least N samples (iterations):

N >= log(1 - probability) / log(1 - e**m)

where the probability (confidence) is typically set to a high value such as 0.99 (the default) and e is the current fraction of inliers w.r.t. the total number of samples.

residual_metric

:callable, optional

Metric to reduce the dimensionality of the residuals to 1 for multi-dimensional target values y.shape[1] > 1. By default the sum of absolute differences is used:

lambda dy: np.sum(np.abs(dy), axis=1)

random_state

:integer or numpy.RandomState, optional

The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
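The bound given for stop_probability, N >= log(1 - probability) / log(1 - e**m), can be evaluated with a short sketch. The function required_trials is hypothetical, and it assumes that m in the formula is the number of samples drawn per trial (min_samples); the handling of the degenerate cases is likewise an assumption.

```python
import math


def required_trials(n_inliers, n_samples, min_samples, probability=0.99):
    """Trials needed to draw one outlier-free subset with the given
    confidence (illustrative sketch)."""
    inlier_ratio = float(n_inliers) / n_samples
    nominator = 1.0 - probability
    denominator = 1.0 - inlier_ratio ** min_samples
    if nominator >= 1.0:
        return 0             # no confidence requested
    if denominator >= 1.0 or nominator <= 0.0:
        return float("inf")  # no inliers, or certainty requested
    if denominator <= 0.0:
        return 1             # every sample is an inlier
    return int(math.ceil(math.log(nominator) / math.log(denominator)))


# Half of the data are inliers and each trial draws two samples.
trials = required_trials(n_inliers=50, n_samples=100, min_samples=2)
```

The bound grows quickly as the inlier fraction drops or as more samples are needed per trial.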

Attributes

estimator_: Best fitted model (copy of the base_estimator object).
n_trials_: Number of random selection trials until one of the stop criteria is met. It is always <= max_trials.
inlier_mask_: Boolean mask of inliers classified as True.

References

[1]	http://en.wikipedia.org/wiki/RANSAC

[2]	http://www.cs.columbia.edu/~belhumeur/courses/compPhoto/ransac.pdf

[3]	http://www.bmva.org/bmvc/2009/Papers/Paper355/Paper355.pdf

POSSIBLE NODE NAMES:
	RANSACRegressorSklearn RANSACRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RBFSamplerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RBFSamplerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Approximates feature map of an RBF kernel by Monte Carlo approximation of its Fourier transform.

This node has been automatically generated by wrapping the sklearn.kernel_approximation.RBFSampler class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

It implements a variant of Random Kitchen Sinks.[1]

Read more in the User Guide.

Parameters

gamma: Parameter of RBF kernel: exp(-gamma * x^2)
n_components: Number of Monte Carlo samples per original feature. Equals the dimensionality of the computed feature space.
random_state: If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator.
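The Monte Carlo approximation can be sketched with the standard library alone. The factory rbf_sampler below is a hypothetical illustration of random Fourier features: it draws Gaussian weights scaled by sqrt(2 * gamma) and uniform offsets in [0, 2*pi), so that the inner product of two transformed vectors approximates exp(-gamma * ||x - y||^2). It is a sketch under these assumptions, not the wrapped implementation.

```python
import math
import random


def rbf_sampler(n_features, n_components, gamma=1.0, seed=0):
    """Return a function mapping a vector to random Fourier features."""
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)
    weights = [[scale * rng.gauss(0.0, 1.0) for _ in range(n_features)]
               for _ in range(n_components)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    norm = math.sqrt(2.0 / n_components)

    def transform(x):
        return [norm * math.cos(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(weights, offsets)]

    return transform


transform = rbf_sampler(n_features=2, n_components=2000, gamma=0.5)
x, y = [0.3, -0.2], [0.1, 0.4]
approx = sum(a * b for a, b in zip(transform(x), transform(y)))
exact = math.exp(-0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```

Increasing n_components reduces the variance of the approximation at the cost of a larger feature space.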

Notes

See “Random Features for Large-Scale Kernel Machines” by A. Rahimi and Benjamin Recht.

[1] “Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning” by A. Rahimi and Benjamin Recht. (http://www.eecs.berkeley.edu/~brecht/papers/08.rah.rec.nips.pdf)

POSSIBLE NODE NAMES:
	RBFSamplerTransformerSklearnNode RBFSamplerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RFECVTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RFECVTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Feature ranking with recursive feature elimination and cross-validated selection of the best number of features.

This node has been automatically generated by wrapping the sklearn.feature_selection.rfe.RFECV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

estimator

:object

A supervised learning estimator with a fit method that updates a coef_ attribute that holds the fitted parameters. Important features must correspond to high absolute values in the coef_ array.

For instance, this is the case for most supervised learning algorithms such as Support Vector Classifiers and Generalized Linear Models from the svm and linear_model modules.

step

:int or float, optional (default=1)

If greater than or equal to 1, then step corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), then step corresponds to the percentage (rounded down) of features to remove at each iteration.

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and y is binary or multiclass, StratifiedKFold is used. If the estimator is not a classifier or if y is neither binary nor multiclass, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

scoring

:string, callable or None, optional, default: None

A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y).

estimator_params

:dict

Parameters for the external estimator. This attribute is deprecated as of version 0.16 and will be removed in 0.18. Use estimator initialisation or set_params method instead.

verbose

:int, default=0

Controls verbosity of output.

Attributes

n_features_: The number of selected features with cross-validation.
support_: The mask of selected features.
ranking_: The feature ranking, such that ranking_[i] corresponds to the ranking position of the i-th feature. Selected (i.e., estimated best) features are assigned rank 1.
grid_scores_: The cross-validation scores such that grid_scores_[i] corresponds to the CV score of the i-th subset of features.
estimator_: The external estimator fit on the reduced dataset.

Notes

The size of grid_scores_ is equal to ceil((n_features - 1) / step) + 1, where step is the number of features removed at each iteration.
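The size formula can be checked with a few lines of Python. The helper n_grid_scores is hypothetical; converting a fractional step into a feature count via int(max(1, step * n_features)) is an assumption based on the description of the step parameter.

```python
import math


def n_grid_scores(n_features, step=1):
    """Expected length of grid_scores_ (illustrative sketch)."""
    if 0.0 < step < 1.0:
        # Assumption: a fractional step is turned into a feature count.
        step = int(max(1, step * n_features))
    return int(math.ceil((n_features - 1) / float(step))) + 1


one_by_one = n_grid_scores(10, step=1)    # 10 subsets: 10, 9, ..., 1 features
three_at_once = n_grid_scores(10, step=3)
```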

Examples

The following example shows how to retrieve the 5 informative features in the Friedman #1 dataset, which are not known a priori.

>>> from sklearn.datasets import make_friedman1
>>> from sklearn.feature_selection import RFECV
>>> from sklearn.svm import SVR
>>> X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
>>> estimator = SVR(kernel="linear")
>>> selector = RFECV(estimator, step=1, cv=5)
>>> selector = selector.fit(X, y)
>>> selector.support_ 
array([ True,  True,  True,  True,  True,
        False, False, False, False, False], dtype=bool)
>>> selector.ranking_
array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])

References

[1]	Guyon, I., Weston, J., Barnhill, S., & Vapnik, V., “Gene selection for cancer classification using support vector machines”, Mach. Learn., 46(1-3), 389–422, 2002.

POSSIBLE NODE NAMES:
	RFECVTransformerSklearn RFECVTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RFETransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RFETransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Feature ranking with recursive feature elimination.

This node has been automatically generated by wrapping the sklearn.feature_selection.rfe.RFE class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Given an external estimator that assigns weights to features (e.g., the coefficients of a linear model), the goal of recursive feature elimination (RFE) is to select features by recursively considering smaller and smaller sets of features. First, the estimator is trained on the initial set of features and weights are assigned to each one of them. Then, features whose absolute weights are the smallest are pruned from the current set of features. That procedure is recursively repeated on the pruned set until the desired number of features to select is eventually reached.
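The elimination loop described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the sklearn implementation: the function names rfe and covariance_weights are hypothetical, and the covariance of each column with y stands in for the coef_ array of a fitted estimator.

```python
def covariance_weights(X, y):
    """Hypothetical stand-in for coef_: covariance of each column with y."""
    n = len(y)
    mean_y = sum(y) / n
    weights = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        mean_x = sum(column) / n
        weights.append(
            sum((a - mean_x) * (b - mean_y) for a, b in zip(column, y)) / n
        )
    return weights


def rfe(X, y, weight_fn, n_features_to_select, step=1):
    """Return (support, ranking) after recursive feature elimination."""
    n_features = len(X[0])
    remaining = list(range(n_features))
    ranking = [1] * n_features
    while len(remaining) > n_features_to_select:
        # Re-fit on the surviving columns only.
        X_sub = [[row[j] for j in remaining] for row in X]
        weights = weight_fn(X_sub, y)
        # Drop the features with the smallest absolute weights,
        # but never more than needed to reach the target count.
        n_drop = min(step, len(remaining) - n_features_to_select)
        order = sorted(range(len(remaining)), key=lambda k: abs(weights[k]))
        dropped = {remaining[k] for k in order[:n_drop]}
        remaining = [j for j in remaining if j not in dropped]
        # Every eliminated feature moves one rank further down.
        for j in range(n_features):
            if j not in remaining:
                ranking[j] += 1
    support = [j in remaining for j in range(n_features)]
    return support, ranking


# Toy data: column 0 equals y, column 1 is weakly related, column 2 is constant.
X = [[1, 0.5, 7], [2, 0.4, 7], [3, 0.6, 7], [4, 0.5, 7]]
y = [1, 2, 3, 4]
support, ranking = rfe(X, y, covariance_weights, n_features_to_select=1)
print(support, ranking)
```

The constant column is eliminated first and therefore ends up with the largest rank, while the surviving feature keeps rank 1, mirroring the meaning of support_ and ranking_.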

Read more in the User Guide.

Parameters

estimator

:object

A supervised learning estimator with a fit method that updates a coef_ attribute that holds the fitted parameters. Important features must correspond to high absolute values in the coef_ array.

For instance, this is the case for most supervised learning algorithms such as Support Vector Classifiers and Generalized Linear Models from the svm and linear_model modules.

n_features_to_select

:int or None (default=None)

The number of features to select. If None, half of the features are selected.

step

:int or float, optional (default=1)

If greater than or equal to 1, then step corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), then step corresponds to the percentage (rounded down) of features to remove at each iteration.
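The two interpretations of step can be illustrated with a small Python sketch. The helper name features_removed_per_iteration is hypothetical, and the lower bound of one feature per iteration is an assumption of this sketch, added so that the elimination always makes progress.

```python
def features_removed_per_iteration(step, n_features):
    """Translate the step parameter into a number of features to remove."""
    if 0.0 < step < 1.0:
        # Fraction of the features, rounded down (assumed minimum of 1).
        return max(1, int(step * n_features))
    # Values >= 1 are taken as an absolute number of features.
    return int(step)


print(features_removed_per_iteration(1, 10))
print(features_removed_per_iteration(0.25, 10))
```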

estimator_params

:dict

Parameters for the external estimator. This attribute is deprecated as of version 0.16 and will be removed in 0.18. Use estimator initialisation or set_params method instead.

verbose

:int, default=0

Controls verbosity of output.

Attributes

n_features_: The number of selected features.
support_: The mask of selected features.
ranking_: The feature ranking, such that ranking_[i] corresponds to the ranking position of the i-th feature. Selected (i.e., estimated best) features are assigned rank 1.
estimator_: The external estimator fit on the reduced dataset.

Examples

The following example shows how to retrieve the 5 informative features in the Friedman #1 dataset.

>>> from sklearn.datasets import make_friedman1
>>> from sklearn.feature_selection import RFE
>>> from sklearn.svm import SVR
>>> X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
>>> estimator = SVR(kernel="linear")
>>> selector = RFE(estimator, 5, step=1)
>>> selector = selector.fit(X, y)
>>> selector.support_ 
array([ True,  True,  True,  True,  True,
        False, False, False, False, False], dtype=bool)
>>> selector.ranking_
array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])

References

[1]	Guyon, I., Weston, J., Barnhill, S., & Vapnik, V., “Gene selection for cancer classification using support vector machines”, Mach. Learn., 46(1-3), 389–422, 2002.

POSSIBLE NODE NAMES:
	RFETransformerSklearnNode RFETransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RadiusNeighborsClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RadiusNeighborsClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Classifier implementing a vote among neighbors within a given radius

This node has been automatically generated by wrapping the sklearn.neighbors.classification.RadiusNeighborsClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

radius

:float, optional (default = 1.0)

Range of parameter space to use by default for radius_neighbors queries.

weights

:str or callable

Weight function used in prediction. Possible values:

‘uniform’ : uniform weights. All points in each neighborhood are weighted equally.
‘distance’ : weight points by the inverse of their distance. In this case, closer neighbors of a query point will have a greater influence than neighbors which are further away.
[callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.

Uniform weights are used by default.
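To make the difference between the ‘uniform’ and ‘distance’ options concrete, here is a minimal one-dimensional sketch in plain Python. It is not the sklearn implementation: the function name radius_vote is hypothetical, callable weights are left out, and ties are broken in favour of the smallest label as a simplifying assumption. The outlier_label argument follows the parameter of the same name documented below.

```python
from collections import defaultdict


def radius_vote(train_x, train_y, query, radius=1.0,
                weights="uniform", outlier_label=None):
    """Weighted vote among the training points within `radius` of `query`."""
    votes = defaultdict(float)
    for x, label in zip(train_x, train_y):
        dist = abs(x - query)
        if dist > radius:
            continue  # outside the neighborhood
        if weights == "uniform":
            votes[label] += 1.0
        elif weights == "distance":
            # Inverse-distance weighting; an exact match dominates the vote.
            votes[label] += float("inf") if dist == 0 else 1.0 / dist
        else:
            raise ValueError("this sketch supports 'uniform' and 'distance' only")
    if not votes:
        if outlier_label is None:
            raise ValueError("no neighbors found within the given radius")
        return outlier_label
    # Iterate over sorted labels so that ties go to the smallest label.
    return max(sorted(votes), key=votes.get)


train_x, train_y = [0.0, 0.1, 1.0], [0, 0, 1]
print(radius_vote(train_x, train_y, 0.9, weights="uniform"))
print(radius_vote(train_x, train_y, 0.9, weights="distance"))
```

With uniform weights the two distant points of class 0 outvote the single nearby point of class 1; with distance weights the nearby point wins.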

algorithm

:{‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’}, optional

Algorithm used to compute the nearest neighbors:

‘ball_tree’ will use BallTree
‘kd_tree’ will use KDTree
‘brute’ will use a brute-force search.
‘auto’ will attempt to decide the most appropriate algorithm based on the values passed to the fit method.

Note: fitting on sparse input will override the setting of this parameter, using brute force.

leaf_size

:int, optional (default = 30)

Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem.

metric

:string or DistanceMetric object (default=’minkowski’)

The distance metric to use for the tree. The default metric is minkowski, which with p=2 is equivalent to the standard Euclidean metric. See the documentation of the DistanceMetric class for a list of available metrics.

p

:integer, optional (default = 2)

Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1); when p = 2, it is equivalent to euclidean_distance (l2). For arbitrary p, minkowski_distance (l_p) is used.
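The role of p can be seen in a short plain-Python sketch of the Minkowski distance. The function name minkowski_distance is used here only for illustration and is defined locally rather than imported from sklearn.

```python
def minkowski_distance(a, b, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# p = 1 gives the Manhattan (l1) distance, p = 2 the Euclidean (l2) distance.
print(minkowski_distance([0, 0], [3, 4], p=1))  # 7.0
print(minkowski_distance([0, 0], [3, 4], p=2))  # 5.0
```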

outlier_label

:int, optional (default = None)

Label given to outlier samples (samples with no neighbors within the given radius). If set to None, a ValueError is raised when an outlier is detected.

metric_params

:dict, optional (default = None)

Additional keyword arguments for the metric function.

Examples

>>> X = [[0], [1], [2], [3]]
>>> y = [0, 0, 1, 1]
>>> from sklearn.neighbors import RadiusNeighborsClassifier
>>> neigh = RadiusNeighborsClassifier(radius=1.0)
>>> neigh.fit(X, y) 
RadiusNeighborsClassifier(...)
>>> print(neigh.predict([[1.5]]))
[0]

See also

KNeighborsClassifier RadiusNeighborsRegressor KNeighborsRegressor NearestNeighbors

Notes

See Nearest Neighbors in the online documentation for a discussion of the choice of algorithm and leaf_size.

http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm

POSSIBLE NODE NAMES:
	RadiusNeighborsClassifierSklearnNode RadiusNeighborsClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RadiusNeighborsRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RadiusNeighborsRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Regression based on neighbors within a fixed radius.

This node has been automatically generated by wrapping the sklearn.neighbors.regression.RadiusNeighborsRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The target is predicted by local interpolation of the targets associated with the nearest neighbors in the training set.
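For uniform weights, this local interpolation amounts to averaging the targets of all training points within the radius. The following one-dimensional sketch in plain Python illustrates the idea; the function name radius_regress is hypothetical, and raising a ValueError for a query without neighbors is a choice made for this sketch, not a statement about the wrapped sklearn class.

```python
def radius_regress(train_x, train_y, query, radius=1.0):
    """Average the targets of all training points within `radius` of `query`."""
    targets = [t for x, t in zip(train_x, train_y) if abs(x - query) <= radius]
    if not targets:
        raise ValueError("no neighbors found within the given radius")
    return sum(targets) / len(targets)


# The points at 1 and 2 lie within radius 1.0 of the query 1.5,
# so the prediction is the mean of their targets 0 and 1.
print(radius_regress([0, 1, 2, 3], [0, 0, 1, 1], 1.5))  # 0.5
```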

Read more in the User Guide.

Parameters

radius

:float, optional (default = 1.0)

Range of parameter space to use by default for radius_neighbors queries.

weights

:str or callable

Weight function used in prediction. Possible values:

‘uniform’ : uniform weights. All points in each neighborhood are weighted equally.
‘distance’ : weight points by the inverse of their distance. In this case, closer neighbors of a query point will have a greater influence than neighbors which are further away.
[callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.

Uniform weights are used by default.

algorithm

:{‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’}, optional

Algorithm used to compute the nearest neighbors:

‘ball_tree’ will use BallTree
‘kd_tree’ will use KDTree
‘brute’ will use a brute-force search.
‘auto’ will attempt to decide the most appropriate algorithm based on the values passed to the fit method.

Note: fitting on sparse input will override the setting of this parameter, using brute force.

leaf_size

:int, optional (default = 30)

Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem.

metric

:string or DistanceMetric object (default=’minkowski’)

The distance metric to use for the tree. The default metric is minkowski, which with p=2 is equivalent to the standard Euclidean metric. See the documentation of the DistanceMetric class for a list of available metrics.

p

:integer, optional (default = 2)

Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1); when p = 2, it is equivalent to euclidean_distance (l2). For arbitrary p, minkowski_distance (l_p) is used.

metric_params

:dict, optional (default = None)

Additional keyword arguments for the metric function.

Examples

>>> X = [[0], [1], [2], [3]]
>>> y = [0, 0, 1, 1]
>>> from sklearn.neighbors import RadiusNeighborsRegressor
>>> neigh = RadiusNeighborsRegressor(radius=1.0)
>>> neigh.fit(X, y) 
RadiusNeighborsRegressor(...)
>>> print(neigh.predict([[1.5]]))
[ 0.5]

See also

NearestNeighbors KNeighborsRegressor KNeighborsClassifier RadiusNeighborsClassifier

Notes

See Nearest Neighbors in the online documentation for a discussion of the choice of algorithm and leaf_size.

http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm

POSSIBLE NODE NAMES:
	RadiusNeighborsRegressorSklearn RadiusNeighborsRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RandomForestClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RandomForestClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

A random forest classifier.

This node has been automatically generated by wrapping the sklearn.ensemble.forest.RandomForestClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is always the same as the original input sample size but the samples are drawn with replacement if bootstrap=True (default).
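The sub-sampling described above (same size as the input, drawn with replacement) can be sketched with the standard library. The helper names bootstrap_indices and out_of_bag are hypothetical; the samples that are never drawn for a given tree are the out-of-bag samples that the oob_score option refers to.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def out_of_bag(n_samples, sample):
    """Indices that never appear in the bootstrap sample."""
    drawn = set(sample)
    return [i for i in range(n_samples) if i not in drawn]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample = bootstrap_indices(10, rng)
oob = out_of_bag(10, sample)
# The bootstrap sample has as many entries as the original data,
# but some rows repeat and the remaining rows are out-of-bag.
print(len(sample), sorted(set(sample)), oob)
```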

Read more in the User Guide.

Parameters

n_estimators

:integer, optional (default=10)

The number of trees in the forest.

criterion

:string, optional (default=”gini”)

The function to measure the quality of a split. Supported criteria are “gini” for the Gini impurity and “entropy” for the information gain. Note: this parameter is tree-specific.

max_features

:int, float, string or None, optional (default=”auto”)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.
If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split.
If “auto”, then max_features=sqrt(n_features).
If “sqrt”, then max_features=sqrt(n_features) (same as “auto”).
If “log2”, then max_features=log2(n_features).
If None, then max_features=n_features.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if that requires inspecting more than max_features features. Note: this parameter is tree-specific.
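The rules listed above for max_features can be summarized in a small Python 3 sketch. The helper resolve_max_features and its auto_is_sqrt flag are hypothetical; the flag only exists because “auto” is documented as sqrt(n_features) for the classifier and as n_features for the regressor. Rounding the square root and logarithm down to an integer is an assumption of this sketch.

```python
import math


def resolve_max_features(max_features, n_features, auto_is_sqrt=True):
    """Map a max_features setting to a number of candidate features."""
    if max_features is None:
        return n_features
    if max_features == "auto":
        return int(math.sqrt(n_features)) if auto_is_sqrt else n_features
    if max_features == "sqrt":
        return int(math.sqrt(n_features))
    if max_features == "log2":
        return int(math.log2(n_features))
    if isinstance(max_features, float):
        # A float is read as a fraction of the available features.
        return int(max_features * n_features)
    return int(max_features)


for setting in (None, "auto", "sqrt", "log2", 0.5, 3):
    print(setting, resolve_max_features(setting, 16))
```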

max_depth

:integer or None, optional (default=None)

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. Ignored if max_leaf_nodes is not None. Note: this parameter is tree-specific.

min_samples_split

:integer, optional (default=2)

The minimum number of samples required to split an internal node. Note: this parameter is tree-specific.

min_samples_leaf

:integer, optional (default=1)

The minimum number of samples in newly created leaves. A split is discarded if, after the split, one of the leaves would contain fewer than min_samples_leaf samples. Note: this parameter is tree-specific.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node. Note: this parameter is tree-specific.

max_leaf_nodes

:int or None, optional (default=None)

Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None, the number of leaf nodes is unlimited. If not None, max_depth will be ignored. Note: this parameter is tree-specific.

bootstrap

:boolean, optional (default=True)

Whether bootstrap samples are used when building trees.

oob_score

:bool

Whether to use out-of-bag samples to estimate the generalization error.

n_jobs

:integer, optional (default=1)

The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

verbose

:int, optional (default=0)

Controls the verbosity of the tree building process.

warm_start

:bool, optional (default=False)

When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.

class_weight

:dict, list of dicts, “balanced”, “balanced_subsample” or None, optional

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. For multi-output problems, a list of dicts can be provided in the same order as the columns of y.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

The “balanced_subsample” mode is the same as “balanced” except that weights are computed based on the bootstrap sample for every tree grown.

For multi-output, the weights of each column of y will be multiplied.

Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.
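The “balanced” formula n_samples / (n_classes * np.bincount(y)) can be reproduced without NumPy for illustration. This is a minimal Python 3 sketch; the helper name balanced_class_weights is hypothetical.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight each class inversely proportional to its frequency in y."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Three samples of class 0 and one of class 1: the rare class gets
# the larger weight (2.0), the frequent class the smaller one (2/3).
print(balanced_class_weights([0, 0, 0, 1]))
```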

Attributes

estimators_: The collection of fitted sub-estimators.
classes_: The class labels (single output problem), or a list of arrays of class labels (multi-output problem).
n_classes_: The number of classes (single output problem), or a list containing the number of classes for each output (multi-output problem).
n_features_: The number of features when fit is performed.
n_outputs_: The number of outputs when fit is performed.
feature_importances_: The feature importances (the higher, the more important the feature).
oob_score_: Score of the training dataset obtained using an out-of-bag estimate.
oob_decision_function_: Decision function computed with out-of-bag estimate on the training set. If n_estimators is small it might be possible that a data point was never left out during the bootstrap. In this case, oob_decision_function_ might contain NaN.

References

[1]	Breiman, “Random Forests”, Machine Learning, 45(1), 5-32, 2001.

See also

DecisionTreeClassifier, ExtraTreesClassifier

POSSIBLE NODE NAMES:
	RandomForestClassifierSklearn RandomForestClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RandomForestRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RandomForestRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

A random forest regressor.

This node has been automatically generated by wrapping the sklearn.ensemble.forest.RandomForestRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is always the same as the original input sample size but the samples are drawn with replacement if bootstrap=True (default).

Read more in the User Guide.

Parameters

n_estimators

:integer, optional (default=10)

The number of trees in the forest.

criterion

:string, optional (default=”mse”)

The function to measure the quality of a split. The only supported criterion is “mse” for the mean squared error. Note: this parameter is tree-specific.

max_features

:int, float, string or None, optional (default=”auto”)

The number of features to consider when looking for the best split:

If int, then consider max_features features at each split.
If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split.
If “auto”, then max_features=n_features.
If “sqrt”, then max_features=sqrt(n_features).
If “log2”, then max_features=log2(n_features).
If None, then max_features=n_features.

Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if that requires inspecting more than max_features features. Note: this parameter is tree-specific.

max_depth

:integer or None, optional (default=None)

The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. Ignored if max_leaf_nodes is not None. Note: this parameter is tree-specific.

min_samples_split

:integer, optional (default=2)

The minimum number of samples required to split an internal node. Note: this parameter is tree-specific.

min_samples_leaf

:integer, optional (default=1)

The minimum number of samples in newly created leaves. A split is discarded if, after the split, one of the leaves would contain fewer than min_samples_leaf samples. Note: this parameter is tree-specific.

min_weight_fraction_leaf

:float, optional (default=0.)

The minimum weighted fraction of the input samples required to be at a leaf node. Note: this parameter is tree-specific.

max_leaf_nodes

:int or None, optional (default=None)

Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None, the number of leaf nodes is unlimited. If not None, max_depth will be ignored. Note: this parameter is tree-specific.

bootstrap

:boolean, optional (default=True)

Whether bootstrap samples are used when building trees.

oob_score

:bool

Whether to use out-of-bag samples to estimate the generalization error.

n_jobs

:integer, optional (default=1)

The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

verbose

:int, optional (default=0)

Controls the verbosity of the tree building process.

warm_start

:bool, optional (default=False)

When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.

Attributes

estimators_: The collection of fitted sub-estimators.
feature_importances_: The feature importances (the higher, the more important the feature).
n_features_: The number of features when fit is performed.
n_outputs_: The number of outputs when fit is performed.
oob_score_: Score of the training dataset obtained using an out-of-bag estimate.
oob_prediction_: Prediction computed with out-of-bag estimate on the training set.

References

[1]	Breiman, “Random Forests”, Machine Learning, 45(1), 5-32, 2001.

See also

DecisionTreeRegressor, ExtraTreesRegressor

POSSIBLE NODE NAMES:
	RandomForestRegressorSklearn RandomForestRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RandomTreesEmbeddingTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RandomTreesEmbeddingTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

An ensemble of totally random trees.

This node has been automatically generated by wrapping the sklearn.ensemble.forest.RandomTreesEmbedding class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

An unsupervised transformation of a dataset to a high-dimensional sparse representation. A datapoint is coded according to which leaf of each tree it is sorted into. Using a one-hot encoding of the leaves, this leads to a binary coding with as many ones as there are trees in the forest.

The dimensionality of the resulting representation is n_out <= n_estimators * max_leaf_nodes. If max_leaf_nodes == None, the number of leaf nodes is at most n_estimators * 2 ** max_depth.
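The leaf-based coding can be illustrated with a plain-Python sketch. It assumes that the index of the leaf reached in each tree is already known; the helper name one_hot_leaves is hypothetical and no trees are actually built here.

```python
def one_hot_leaves(leaf_ids, leaves_per_tree):
    """Concatenate one one-hot block per tree, marking the leaf that was reached."""
    code = []
    for leaf, n_leaves in zip(leaf_ids, leaves_per_tree):
        block = [0] * n_leaves
        block[leaf] = 1
        code.extend(block)
    return code


# Three trees with four leaves each: the code has 3 * 4 = 12 entries
# and exactly three ones, one per tree.
code = one_hot_leaves([2, 0, 1], [4, 4, 4])
print(code, sum(code))
```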

Read more in the User Guide.

Parameters

n_estimators: Number of trees in the forest.
max_depth: The maximum depth of each tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. Ignored if max_leaf_nodes is not None.
min_samples_split: The minimum number of samples required to split an internal node.
min_samples_leaf: The minimum number of samples in newly created leaves. A split is discarded if, after the split, one of the leaves would contain fewer than min_samples_leaf samples.
min_weight_fraction_leaf: The minimum weighted fraction of the input samples required to be at a leaf node.
max_leaf_nodes: Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes. If not None then max_depth will be ignored.
sparse_output: Whether or not to return a sparse CSR matrix, as default behavior, or to return a dense array compatible with dense pipeline operators.
n_jobs: The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.
random_state: If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
verbose: Controls the verbosity of the tree building process.
warm_start: When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.

Attributes

estimators_: The collection of fitted sub-estimators.

References

[1]	P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees”, Machine Learning, 63(1), 3-42, 2006.

[2]	Moosmann, F. and Triggs, B. and Jurie, F. “Fast discriminative visual codebooks using randomized clustering forests” NIPS 2007

POSSIBLE NODE NAMES:
	RandomTreesEmbeddingTransformerSklearn RandomTreesEmbeddingTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RandomizedLassoTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RandomizedLassoTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Randomized Lasso.

This node has been automatically generated by wrapping the sklearn.linear_model.randomized_l1.RandomizedLasso class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Randomized Lasso works by resampling the training data and computing a Lasso on each resampling. In short, the features selected more often are good features. It is also known as stability selection.
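The resampling idea can be sketched independently of the Lasso itself. In the following plain-Python illustration, select_fn is a hypothetical callable that stands in for "fit a Lasso on these rows and return the indices of the features with non-zero coefficients". Drawing a fixed-size subsample without replacement and omitting the random feature scaling are simplifying assumptions of this sketch, and the numeric defaults are illustrative values only.

```python
import random


def stability_scores(select_fn, n_samples, n_features,
                     n_resampling=200, sample_fraction=0.75, seed=0):
    """Fraction of resamplings in which each feature was selected."""
    rng = random.Random(seed)
    counts = [0] * n_features
    subset_size = max(1, int(sample_fraction * n_samples))
    for _ in range(n_resampling):
        rows = rng.sample(range(n_samples), subset_size)
        for j in select_fn(rows):
            counts[j] += 1
    return [c / n_resampling for c in counts]


def toy_select(rows):
    """Hypothetical selector: feature 0 always, feature 1 only if row 0 was drawn."""
    selected = [0]
    if 0 in rows:
        selected.append(1)
    return selected


scores = stability_scores(toy_select, n_samples=8, n_features=3, n_resampling=50)
# Features whose score exceeds the threshold are kept.
selected = [j for j, s in enumerate(scores) if s > 0.25]
print(scores, selected)
```

The resulting scores lie between 0 and 1, and features above the selection threshold are the ones retained.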

Read more in the User Guide.

Parameters

alpha

:float, ‘aic’, or ‘bic’, optional

The regularization parameter alpha in the Lasso. Warning: this is not the alpha parameter in the stability selection article, which corresponds to scaling.

scaling

:float, optional

The alpha parameter in the stability selection article used to randomly scale the features. Should be between 0 and 1.

sample_fraction

:float, optional

The fraction of samples to be used in each randomized design. Should be between 0 and 1. If 1, all samples are used.

n_resampling

:int, optional

Number of randomized models.

selection_threshold

:float, optional

The score above which features should be selected.

fit_intercept

:boolean, optional

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

verbose

:boolean or integer, optional

Sets the verbosity amount.

normalize

:boolean, optional, default True

If True, the regressors X will be normalized before regression.

precompute

:True | False | ‘auto’

Whether to use a precomputed Gram matrix to speed up calculations. If set to ‘auto’, the choice is made automatically. The Gram matrix can also be passed as argument.

max_iter

:integer, optional

Maximum number of iterations to perform in the Lars algorithm.

eps

:float, optional

The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Unlike the ‘tol’ parameter in some iterative optimization-based algorithms, this parameter does not control the tolerance of the optimization.

n_jobs

:integer, optional

Number of CPUs to use during the resampling. If ‘-1’, use all the CPUs.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

pre_dispatch

:int, or string, optional

Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:

None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs

An int, giving the exact number of total jobs that are spawned

A string, giving an expression as a function of n_jobs, as in ‘2*n_jobs’

memory

:Instance of joblib.Memory or string

Used for internal caching. By default, no caching is done. If a string is given, it is the path to the caching directory.

Attributes

scores_: Feature scores between 0 and 1.
all_scores_: Feature scores between 0 and 1 for all values of the regularization parameter. The reference article suggests scores_ is the max of all_scores_.

Examples

>>> from sklearn.linear_model import RandomizedLasso
>>> randomized_lasso = RandomizedLasso()

Notes

See examples/linear_model/plot_sparse_recovery.py for an example.

References

Stability selection Nicolai Meinshausen, Peter Buhlmann Journal of the Royal Statistical Society: Series B Volume 72, Issue 4, pages 417-473, September 2010 DOI: 10.1111/j.1467-9868.2010.00740.x

See also

RandomizedLogisticRegression, LogisticRegression

POSSIBLE NODE NAMES:
	RandomizedLassoTransformerSklearn RandomizedLassoTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RandomizedLogisticRegressionTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RandomizedLogisticRegressionTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Randomized Logistic Regression

This node has been automatically generated by wrapping the sklearn.linear_model.randomized_l1.RandomizedLogisticRegression class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Randomized Logistic Regression works by resampling the training data and computing a LogisticRegression on each resampling. In short, the features selected more often are good features. It is also known as stability selection.

Read more in the User Guide.

Parameters

C

:float, optional, default=1

The regularization parameter C in the LogisticRegression.

scaling

:float, optional, default=0.5

The alpha parameter in the stability selection article used to randomly scale the features. Should be between 0 and 1.

sample_fraction

:float, optional, default=0.75

The fraction of samples to be used in each randomized design. Should be between 0 and 1. If 1, all samples are used.

n_resampling

:int, optional, default=200

Number of randomized models.

selection_threshold

:float, optional, default=0.25

The score above which features should be selected.

fit_intercept

:boolean, optional, default=True

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

verbose

:boolean or integer, optional

Sets the verbosity amount.

normalize

:boolean, optional, default=True

If True, the regressors X will be normalized before regression.

tol

:float, optional, default=1e-3

Tolerance for the stopping criteria of LogisticRegression.

n_jobs

:integer, optional

Number of CPUs to use during the resampling. If ‘-1’, use all the CPUs.

random_state

:int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

pre_dispatch

:int, or string, optional

Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:

None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs

An int, giving the exact number of total jobs that are spawned

A string, giving an expression as a function of n_jobs, as in ‘2*n_jobs’

memory

:Instance of joblib.Memory or string

Used for internal caching. By default, no caching is done. If a string is given, it is the path to the caching directory.

Attributes

scores_: Feature scores between 0 and 1.
all_scores_: Feature scores between 0 and 1 for all values of the regularization parameter. The reference article suggests scores_ is the max of all_scores_.

Examples

>>> from sklearn.linear_model import RandomizedLogisticRegression
>>> randomized_logistic = RandomizedLogisticRegression()

Notes

See examples/linear_model/plot_sparse_recovery.py for an example.

References

Stability selection. Nicolai Meinshausen, Peter Bühlmann. Journal of the Royal Statistical Society: Series B, Volume 72, Issue 4, pages 417-473, September 2010. DOI: 10.1111/j.1467-9868.2010.00740.x

See also

RandomizedLasso, Lasso, ElasticNet

POSSIBLE NODE NAMES:
	RandomizedLogisticRegressionTransformerSklearnNode RandomizedLogisticRegressionTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RandomizedPCATransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RandomizedPCATransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Principal component analysis (PCA) using randomized SVD

This node has been automatically generated by wrapping the sklearn.decomposition.pca.RandomizedPCA class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Linear dimensionality reduction using approximated Singular Value Decomposition of the data and keeping only the most significant singular vectors to project the data to a lower dimensional space.

Read more in the User Guide.

Parameters

n_components

:int, optional

Maximum number of components to keep. When not given or None, this is set to n_features (the second dimension of the training data).

copy

:bool

If False, data passed to fit are overwritten and running fit(X).transform(X) will not yield the expected results, use fit_transform(X) instead.

iterated_power

:int, optional

Number of iterations for the power method. 3 by default.

whiten

:bool, optional

When True (False by default) the components_ vectors are divided by the singular values to ensure uncorrelated outputs with unit component-wise variances.

Whitening will remove some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of the downstream estimators by making their data respect some hard-wired assumptions.

random_state

:int or RandomState instance or None (default)

Pseudo Random Number generator seed control. If None, use the numpy.random singleton.
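
The effect of the whiten option described above can be sketched in plain Python. The data below is made up and assumed to be already projected onto two uncorrelated, centered components. The sketch rescales each component by its population standard deviation, which only approximates what the library does with the singular values.

```python
from statistics import pstdev

# Hypothetical data already projected onto two principal components
# (columns); both columns are centered and mutually uncorrelated.
projected = [
    [-3.0, 0.2],
    [-1.0, -0.2],
    [1.0, -0.2],
    [3.0, 0.2],
]

# Spread of every component (assumption: population standard deviation).
columns = list(zip(*projected))
scales = [pstdev(column) for column in columns]

# Whitening: rescale every component to unit variance. The relative
# variance scales of the components are lost in this step.
whitened = [[value / scale for value, scale in zip(row, scales)]
            for row in projected]
```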

Attributes

components_: Components with maximum variance.
explained_variance_ratio_: Percentage of variance explained by each of the selected components. If k is not set then all components are stored and the sum of explained variances is equal to 1.0.
mean_: Per-feature empirical mean, estimated from the training set.

Examples

>>> import numpy as np
>>> from sklearn.decomposition import RandomizedPCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> pca = RandomizedPCA(n_components=2)
>>> pca.fit(X)                 
RandomizedPCA(copy=True, iterated_power=3, n_components=2,
       random_state=None, whiten=False)
>>> print(pca.explained_variance_ratio_) 
[ 0.99244...  0.00755...]

See also

PCA TruncatedSVD

References

[Halko2009]

Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions Halko, et al., 2009 (arXiv:909)

[MRT]

A randomized algorithm for the decomposition of matrices Per-Gunnar Martinsson, Vladimir Rokhlin and Mark Tygert

POSSIBLE NODE NAMES:
	RandomizedPCATransformerSklearnNode RandomizedPCATransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RandomizedSearchCVTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RandomizedSearchCVTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Randomized search on hyper parameters.

This node has been automatically generated by wrapping the sklearn.grid_search.RandomizedSearchCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

RandomizedSearchCV implements a “fit” and a “score” method. It also implements “predict”, “predict_proba”, “decision_function”, “transform” and “inverse_transform” if they are implemented in the estimator used.

The parameters of the estimator used to apply these methods are optimized by cross-validated search over parameter settings.

In contrast to GridSearchCV, not all parameter values are tried out, but rather a fixed number of parameter settings is sampled from the specified distributions. The number of parameter settings that are tried is given by n_iter.

If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used. It is highly recommended to use continuous distributions for continuous parameters.
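
The two sampling modes can be sketched as follows. This is a simplified illustration, not the library's sampler: sample_settings is a hypothetical helper, and distributions are modelled as callables that take a random.Random instance instead of objects with an rvs method.

```python
import random


def sample_settings(param_distributions, n_iter, seed=0):
    """Draw n_iter parameter settings (hypothetical helper)."""
    rng = random.Random(seed)
    names = sorted(param_distributions)
    only_lists = all(isinstance(param_distributions[name], list)
                     for name in names)
    if only_lists:
        # Only lists: enumerate the grid and sample without replacement.
        grid = [{}]
        for name in names:
            grid = [dict(setting, **{name: value})
                    for setting in grid
                    for value in param_distributions[name]]
        return rng.sample(grid, min(n_iter, len(grid)))
    # At least one distribution: sample with replacement.
    settings = []
    for _ in range(n_iter):
        setting = {}
        for name in names:
            source = param_distributions[name]
            if isinstance(source, list):
                setting[name] = rng.choice(source)
            else:
                setting[name] = source(rng)
        settings.append(setting)
    return settings


list_only = sample_settings({"C": [1, 10, 100], "kernel": ["rbf", "linear"]}, 4)
mixed = sample_settings(
    {"C": lambda rng: rng.uniform(0.1, 10.0), "kernel": ["rbf", "linear"]}, 4)
```

With lists only, no setting is returned twice; with a continuous distribution for C, repeated draws are possible in principle but practically never collide.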

Read more in the User Guide.

Parameters

estimator

:estimator object

An object of that type is instantiated for each grid point. This is assumed to implement the scikit-learn estimator interface. Either estimator needs to provide a score function, or scoring must be passed.

param_distributions

:dict

Dictionary with parameter names (string) as keys and distributions or lists of parameters to try. Distributions must provide an rvs method for sampling (such as those from scipy.stats.distributions). If a list is given, it is sampled uniformly.

n_iter

:int, default=10

Number of parameter settings that are sampled. n_iter trades off runtime vs quality of the solution.

scoring

:string, callable or None, default=None

A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y). If None, the score method of the estimator is used.

fit_params

:dict, optional

Parameters to pass to the fit method.

n_jobs

:int, default=1

Number of jobs to run in parallel.

pre_dispatch

:int, or string, optional

Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:

None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs

An int, giving the exact number of total jobs that are spawned

A string, giving an expression as a function of n_jobs, as in ‘2*n_jobs’

iid

:boolean, default=True

If True, the data is assumed to be identically distributed across the folds, and the loss minimized is the total loss per sample, and not the mean loss across the folds.

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the default 3-fold cross-validation,
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

refit

:boolean, default=True

Refit the best estimator with the entire dataset. If “False”, it is impossible to make predictions using this RandomizedSearchCV instance after fitting.

verbose

:integer

Controls the verbosity: the higher, the more messages.

random_state

:int or RandomState

Pseudo random number generator state used for random uniform sampling from lists of possible values instead of scipy.stats distributions.

error_score

:‘raise’ (default) or numeric

Value to assign to the score if an error occurs in estimator fitting. If set to ‘raise’, the error is raised. If a numeric value is given, FitFailedWarning is raised. This parameter does not affect the refit step, which will always raise the error.
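
The cv parameter above accepts “an iterable yielding train/test splits”. A minimal sketch of such an iterable is shown below; k_fold_splits is a hypothetical helper that mimics plain K-fold splitting without shuffling or stratification, and it is not the library's KFold class.

```python
def k_fold_splits(n_samples, n_folds=3):
    """Yield (train_indices, test_indices) pairs; a simplified K-fold."""
    indices = list(range(n_samples))
    # Distribute the remainder over the first folds.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Seven samples in three folds give test folds of size 3, 2 and 2.
splits = list(k_fold_splits(7, n_folds=3))
```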

Attributes

grid_scores_

:list of named tuples

Contains scores for all parameter settings sampled from param_distributions. Each entry corresponds to one parameter setting. Each named tuple has the attributes:

parameters, a dict of parameter settings

mean_validation_score, the mean score over the cross-validation folds

cv_validation_scores, the list of scores for each fold

best_estimator_

:estimator

Estimator that was chosen by the search, i.e. estimator which gave highest score (or smallest loss if specified) on the left out data. Not available if refit=False.

best_score_

:float

Score of best_estimator_ on the left out data.

best_params_

:dict

Parameter setting that gave the best results on the hold out data.
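
How the attributes above relate to each other can be sketched with a named tuple. The field names follow the description of grid_scores_; the class name GridScore, the helper make_entry and all numbers are made up. The plain mean assumes folds of equal size.

```python
from collections import namedtuple

# Field names follow the attribute description; the class name is
# an assumption made for this illustration.
GridScore = namedtuple(
    "GridScore",
    ["parameters", "mean_validation_score", "cv_validation_scores"])


def make_entry(parameters, fold_scores):
    """Build one entry from the per-fold scores (hypothetical helper)."""
    mean = sum(fold_scores) / len(fold_scores)
    return GridScore(parameters, mean, fold_scores)


grid_scores = [
    make_entry({"C": 1}, [0.50, 0.75, 1.00]),
    make_entry({"C": 10}, [0.75, 1.00, 1.00]),
]

# The best setting is the one with the highest mean validation score.
best = max(grid_scores, key=lambda entry: entry.mean_validation_score)
best_params = best.parameters
best_score = best.mean_validation_score
```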

Notes

The parameters selected are those that maximize the score of the held-out data, according to the scoring parameter.

If n_jobs was set to a value higher than one, the data is copied for each parameter setting (and not n_jobs times). This is done for efficiency reasons if individual jobs take very little time, but may raise errors if the dataset is large and not enough memory is available. A workaround in this case is to set pre_dispatch. Then, the memory is copied only pre_dispatch many times. A reasonable value for pre_dispatch is 2 * n_jobs.

See Also

GridSearchCV: Does exhaustive search over a grid of parameters.
ParameterSampler: A generator over parameter settings, constructed from param_distributions.

POSSIBLE NODE NAMES:
	RandomizedSearchCVTransformerSklearn RandomizedSearchCVTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RidgeCVRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RidgeCVRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Ridge regression with built-in cross-validation.

This node has been automatically generated by wrapping the sklearn.linear_model.ridge.RidgeCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

By default, it performs Generalized Cross-Validation, which is a form of efficient Leave-One-Out cross-validation.

Read more in the User Guide.

Parameters

alphas

:numpy array of shape [n_alphas]

Array of alpha values to try. Small positive values of alpha improve the conditioning of the problem and reduce the variance of the estimates. Alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC.

fit_intercept

:boolean

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

normalize

:boolean, optional, default False

If True, the regressors X will be normalized before regression.

scoring

:string, callable or None, optional, default: None

A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y).

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the efficient Leave-One-Out cross-validation
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

For integer/None inputs, if y is binary or multiclass, StratifiedKFold is used; otherwise, KFold is used.

Refer to the User Guide for the various cross-validation strategies that can be used here.

gcv_mode

:{None, ‘auto’, ‘svd’, ‘eigen’}, optional

Flag indicating which strategy to use when performing Generalized Cross-Validation. Options are:

'auto' : use svd if n_samples > n_features or when X is a sparse
         matrix, otherwise use eigen
'svd' : force computation via singular value decomposition of X
        (does not work for sparse matrices)
'eigen' : force computation via eigendecomposition of X^T X

The ‘auto’ mode is the default and is intended to pick the cheaper option of the two depending upon the shape and format of the training data.

store_cv_values

:boolean, default=False

Flag indicating if the cross-validation values corresponding to each alpha should be stored in the cv_values_ attribute (see below). This flag is only compatible with cv=None (i.e. using Generalized Cross-Validation).
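
The idea of choosing alpha by leave-one-out cross-validation can be sketched for a single feature without intercept. This is a brute-force illustration on made-up data, not the efficient Generalized Cross-Validation used by the wrapped class; ridge_1d and loo_error are hypothetical helpers.

```python
def ridge_1d(xs, ys, alpha):
    """Ridge coefficient for one feature and no intercept."""
    return (sum(x * y for x, y in zip(xs, ys))
            / (sum(x * x for x in xs) + alpha))


def loo_error(xs, ys, alpha):
    """Mean squared leave-one-out prediction error (brute force)."""
    errors = []
    for i in range(len(xs)):
        # Fit on all samples except sample i, then predict sample i.
        w = ridge_1d(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], alpha)
        errors.append((ys[i] - w * xs[i]) ** 2)
    return sum(errors) / len(errors)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.9]
alphas = [0.1, 1.0, 10.0]

# One cross-validation value per alpha; the smallest error wins.
cv_values = {alpha: loo_error(xs, ys, alpha) for alpha in alphas}
best_alpha = min(cv_values, key=cv_values.get)
```

Because the data is close to y = x, strong regularization shrinks the coefficient too far, so the largest alpha gives the largest leave-one-out error.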

Attributes

cv_values_: Cross-validation values for each alpha (if store_cv_values=True and cv=None). After fit() has been called, this attribute will contain the mean squared errors (by default) or the values of the {loss,score}_func function (if provided in the constructor).
coef_: Weight vector(s).
intercept_: Independent term in decision function. Set to 0.0 if fit_intercept = False.
alpha_: Estimated regularization parameter.

See also

Ridge: Ridge regression
RidgeClassifier: Ridge classifier
RidgeClassifierCV: Ridge classifier with built-in cross validation

POSSIBLE NODE NAMES:
	RidgeCVRegressorSklearnNode RidgeCVRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RidgeClassifierCVSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RidgeClassifierCVSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Ridge classifier with built-in cross-validation.

This node has been automatically generated by wrapping the sklearn.linear_model.ridge.RidgeClassifierCV class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

By default, it performs Generalized Cross-Validation, which is a form of efficient Leave-One-Out cross-validation. Currently, only the n_features > n_samples case is handled efficiently.

Read more in the User Guide.

Parameters

alphas

:numpy array of shape [n_alphas]

Array of alpha values to try. Small positive values of alpha improve the conditioning of the problem and reduce the variance of the estimates. Alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC.

fit_intercept

:boolean

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

normalize

:boolean, optional, default False

If True, the regressors X will be normalized before regression.

scoring

:string, callable or None, optional, default: None

A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y).

cv

:int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

None, to use the efficient Leave-One-Out cross-validation
integer, to specify the number of folds.
An object to be used as a cross-validation generator.
An iterable yielding train/test splits.

Refer to the User Guide for the various cross-validation strategies that can be used here.

class_weight

:dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))
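
The “balanced” formula can be written out in plain Python. balanced_class_weights is a hypothetical helper that applies n_samples / (n_classes * count) per class, with collections.Counter standing in for np.bincount so that arbitrary labels work.

```python
from collections import Counter


def balanced_class_weights(y):
    """Return n_samples / (n_classes * count) for every class in y."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Class "a" is three times as frequent as class "b",
# so "b" receives three times the weight of "a".
weights = balanced_class_weights(["a", "a", "a", "b"])
```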

Attributes

cv_values_: Cross-validation values for each alpha (if store_cv_values=True and cv=None). After fit() has been called, this attribute will contain the mean squared errors (by default) or the values of the {loss,score}_func function (if provided in the constructor).

coef_: Weight vector(s).
intercept_: Independent term in decision function. Set to 0.0 if fit_intercept = False.
alpha_: Estimated regularization parameter

See also

Ridge: Ridge regression
RidgeClassifier: Ridge classifier
RidgeCV: Ridge regression with built-in cross validation

Notes

For multi-class classification, n_class classifiers are trained in a one-versus-all approach. Concretely, this is implemented by taking advantage of the multi-variate response support in Ridge.

POSSIBLE NODE NAMES:
	RidgeClassifierCVSklearnNode RidgeClassifierCVSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RidgeClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RidgeClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Classifier using Ridge regression.

This node has been automatically generated by wrapping the sklearn.linear_model.ridge.RidgeClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

alpha

:float

Small positive values of alpha improve the conditioning of the problem and reduce the variance of the estimates. Alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC.

class_weight

:dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

copy_X

:boolean, optional, default True

If True, X will be copied; else, it may be overwritten.

fit_intercept

:boolean

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

max_iter

:int, optional

Maximum number of iterations for conjugate gradient solver. The default value is determined by scipy.sparse.linalg.

normalize

:boolean, optional, default False

If True, the regressors X will be normalized before regression.

solver

:{‘auto’, ‘svd’, ‘cholesky’, ‘lsqr’, ‘sparse_cg’, ‘sag’}

Solver to use in the computational routines:

‘auto’ chooses the solver automatically based on the type of data.
‘svd’ uses a Singular Value Decomposition of X to compute the Ridge coefficients. More stable for singular matrices than ‘cholesky’.
‘cholesky’ uses the standard scipy.linalg.solve function to obtain a closed-form solution.
‘sparse_cg’ uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than ‘cholesky’ for large-scale data (possibility to set tol and max_iter).
‘lsqr’ uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest but may not be available in old scipy versions. It also uses an iterative procedure.
‘sag’ uses a Stochastic Average Gradient descent. It also uses an iterative procedure, and is faster than other solvers when both n_samples and n_features are large.

New in version 0.17: Stochastic Average Gradient descent solver.

tol

:float

Precision of the solution.

random_state

:int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use when shuffling the data. Used in ‘sag’ solver.

Attributes

coef_: Weight vector(s).
intercept_: Independent term in decision function. Set to 0.0 if fit_intercept = False.
n_iter_: Actual number of iterations for each target. Available only for sag and lsqr solvers. Other solvers will return None.

See also

Ridge, RidgeClassifierCV

Notes

For multi-class classification, n_class classifiers are trained in a one-versus-all approach. Concretely, this is implemented by taking advantage of the multi-variate response support in Ridge.
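
The one-versus-all reduction to a multi-variate regression problem can be sketched as follows. The +1/-1 target encoding and both helper functions are assumptions made for illustration; the wrapped class may differ in its details.

```python
def one_versus_all_targets(labels):
    """Encode labels as rows with +1 for the own class and -1 otherwise."""
    classes = sorted(set(labels))
    targets = [[1 if label == c else -1 for c in classes]
               for label in labels]
    return classes, targets


def predict_class(classes, outputs):
    """Pick the class whose regression output is largest."""
    best = max(range(len(classes)), key=lambda i: outputs[i])
    return classes[best]


# Each column of Y is the target of one "class versus rest" regression.
classes, Y = one_versus_all_targets(["cat", "dog", "bird", "dog"])
```

A multi-output Ridge model fitted on Y yields one output per class, and predict_class turns such an output vector back into a label.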

POSSIBLE NODE NAMES:
	RidgeClassifierSklearn RidgeClassifierSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RidgeRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RidgeRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear least squares with l2 regularization.

This node has been automatically generated by wrapping the sklearn.linear_model.ridge.Ridge class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This model solves a regression model where the loss function is the linear least squares function and regularization is given by the l2-norm. Also known as Ridge Regression or Tikhonov regularization. This estimator has built-in support for multi-variate regression (i.e., when y is a 2d-array of shape [n_samples, n_targets]).

Read more in the User Guide.

Parameters

alpha

:{float, array-like}, shape (n_targets)

Small positive values of alpha improve the conditioning of the problem and reduce the variance of the estimates. Alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC. If an array is passed, penalties are assumed to be specific to the targets. Hence they must correspond in number.

copy_X

:boolean, optional, default True

If True, X will be copied; else, it may be overwritten.

fit_intercept

:boolean

Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

max_iter

:int, optional

Maximum number of iterations for conjugate gradient solver. For ‘sparse_cg’ and ‘lsqr’ solvers, the default value is determined by scipy.sparse.linalg. For ‘sag’ solver, the default value is 1000.

normalize

:boolean, optional, default False

If True, the regressors X will be normalized before regression.

solver

:{‘auto’, ‘svd’, ‘cholesky’, ‘lsqr’, ‘sparse_cg’, ‘sag’}

Solver to use in the computational routines:

‘auto’ chooses the solver automatically based on the type of data.
‘svd’ uses a Singular Value Decomposition of X to compute the Ridge coefficients. More stable for singular matrices than ‘cholesky’.
‘cholesky’ uses the standard scipy.linalg.solve function to obtain a closed-form solution.
‘sparse_cg’ uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than ‘cholesky’ for large-scale data (possibility to set tol and max_iter).
‘lsqr’ uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest but may not be available in old scipy versions. It also uses an iterative procedure.
‘sag’ uses a Stochastic Average Gradient descent. It also uses an iterative procedure, and is often faster than other solvers when both n_samples and n_features are large. Note that ‘sag’ fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

The last four solvers all support both dense and sparse data. However, only ‘sag’ supports sparse input when fit_intercept is True.

New in version 0.17: Stochastic Average Gradient descent solver.

tol

:float

Precision of the solution.

random_state

:int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use when shuffling the data. Used in ‘sag’ solver.

New in version 0.17: random_state to support Stochastic Average Gradient.

Attributes

coef_: Weight vector(s).
intercept_: Independent term in decision function. Set to 0.0 if fit_intercept = False.
n_iter_: Actual number of iterations for each target. Available only for sag and lsqr solvers. Other solvers will return None.

See also

RidgeClassifier, RidgeCV, KernelRidge

Examples

>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> np.random.seed(0)
>>> y = np.random.randn(n_samples)
>>> X = np.random.randn(n_samples, n_features)
>>> clf = Ridge(alpha=1.0)
>>> clf.fit(X, y) 
Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
      normalize=False, random_state=None, solver='auto', tol=0.001)

POSSIBLE NODE NAMES:
	RidgeRegressorSklearnNode RidgeRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.RobustScalerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.RobustScalerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Scale features using statistics that are robust to outliers.

This node has been automatically generated by wrapping the sklearn.preprocessing.data.RobustScaler class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This Scaler removes the median and scales the data according to the Interquartile Range (IQR). The IQR is the range between the 1st quartile (25th percentile) and the 3rd quartile (75th percentile).

Centering and scaling happen independently on each feature (or each sample, depending on the axis argument) by computing the relevant statistics on the samples in the training set. Median and interquartile range are then stored to be used on later data using the transform method.

Standardization of a dataset is a common requirement for many machine learning estimators. Typically this is done by removing the mean and scaling to unit variance. However, outliers can often influence the sample mean / variance in a negative way. In such cases, the median and the interquartile range often give better results.
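
The scaling rule can be sketched for a single feature in plain Python. percentile and robust_scale are hypothetical helpers; the linear interpolation between closest ranks is an assumption and may differ from the percentile definition used by the library.

```python
from statistics import median


def percentile(values, q):
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def robust_scale(values):
    """Subtract the median and divide by the interquartile range."""
    center = median(values)
    iqr = percentile(values, 75) - percentile(values, 25)
    return [(value - center) / iqr for value in values]


feature = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier
scaled = robust_scale(feature)
```

The outlier changes neither the median (3.0) nor the interquartile range (2.0), so the four regular values end up between -1.0 and 0.5 regardless of how extreme the outlier is.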

New in version 0.17.

Read more in the User Guide.

Parameters

with_centering: If True, center the data before scaling. This does not work (and will raise an exception) when attempted on sparse matrices, because centering them entails building a dense matrix which in common use cases is likely to be too large to fit in memory.
with_scaling: If True, scale the data to interquartile range.
copy: If False, try to avoid a copy and do inplace scaling instead. This is not guaranteed to always work inplace; e.g. if the data is not a NumPy array or scipy.sparse CSR matrix, a copy may still be returned.

Attributes

center_: The median value for each feature in the training set.
scale_: The (scaled) interquartile range for each feature in the training set.

New in version 0.17: scale_ attribute.

See also

sklearn.preprocessing.StandardScaler to perform centering and scaling using mean and variance.

sklearn.decomposition.RandomizedPCA with whiten=True to further remove the linear correlation across features.

Notes

See examples/preprocessing/plot_robust_scaling.py for an example.

http://en.wikipedia.org/wiki/Median_(statistics)
http://en.wikipedia.org/wiki/Interquartile_range

POSSIBLE NODE NAMES:
	RobustScalerTransformerSklearn RobustScalerTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SGDClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SGDClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear classifiers (SVM, logistic regression, among others) with SGD training.

This node has been automatically generated by wrapping the sklearn.linear_model.stochastic_gradient.SGDClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This estimator implements regularized linear models with stochastic gradient descent (SGD) learning: the gradient of the loss is estimated one sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate). SGD allows minibatch (online/out-of-core) learning, see the partial_fit method. For best results using the default learning rate schedule, the data should have zero mean and unit variance.

This implementation works with data represented as dense or sparse arrays of floating point values for the features. The model it fits can be controlled with the loss parameter; by default, it fits a linear support vector machine (SVM).
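
The classification losses selectable through the loss parameter can be written as plain functions of the decision value p and the true label y in {-1, +1}. These are the textbook definitions, given as a sketch; the function names are chosen for this illustration and are not library identifiers.

```python
import math


def hinge(p, y):
    """Linear SVM loss: zero once the margin p * y reaches 1."""
    return max(0.0, 1.0 - p * y)


def squared_hinge(p, y):
    """Like hinge, but quadratically penalized."""
    return max(0.0, 1.0 - p * y) ** 2


def perceptron(p, y):
    """Linear loss that only penalizes misclassified samples."""
    return max(0.0, -p * y)


def log_loss(p, y):
    """Logistic regression loss."""
    return math.log(1.0 + math.exp(-p * y))


def modified_huber(p, y):
    """Smooth loss that grows only linearly for large violations."""
    z = p * y
    if z >= 1.0:
        return 0.0
    if z >= -1.0:
        return (1.0 - z) ** 2
    return -4.0 * z
```

For a correctly classified sample inside the margin, for example p = 0.5 and y = 1, hinge still returns 0.5 while perceptron returns 0.0.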

The regularizer is a penalty added to the loss function that shrinks model parameters towards the zero vector using either the squared euclidean norm L2 or the absolute norm L1 or a combination of both (Elastic Net). If the parameter update crosses the 0.0 value because of the regularizer, the update is truncated to 0.0 to allow for learning sparse models and achieve online feature selection.
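
The truncation at 0.0 can be sketched for a single weight with an L1 penalty. This is a deliberately simplified illustration: l1_truncated_step is a hypothetical helper, and the library's actual bookkeeping of the penalty may differ.

```python
def l1_truncated_step(weight, gradient, learning_rate, alpha):
    """One SGD step with an L1 penalty that is truncated at zero."""
    # Plain gradient step on the loss.
    updated = weight - learning_rate * gradient
    # Shrink towards zero by the L1 penalty, but never past zero.
    shrink = learning_rate * alpha
    if updated > 0.0:
        return max(0.0, updated - shrink)
    if updated < 0.0:
        return min(0.0, updated + shrink)
    return 0.0


# A small weight is set exactly to zero instead of changing its sign.
small = l1_truncated_step(0.05, 0.0, learning_rate=0.1, alpha=1.0)
# A large weight is merely shrunk.
large = l1_truncated_step(1.0, 0.0, learning_rate=0.1, alpha=1.0)
```

Weights that reach exactly zero drop out of the model, which is how sparse solutions and online feature selection arise.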

Read more in the User Guide.

Parameters

loss

:str, ‘hinge’, ‘log’, ‘modified_huber’, ‘squared_hinge’, ‘perceptron’, or a regression loss: ‘squared_loss’, ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’

The loss function to be used. Defaults to ‘hinge’, which gives a linear SVM. The ‘log’ loss gives logistic regression, a probabilistic classifier. ‘modified_huber’ is another smooth loss that brings tolerance to outliers as well as probability estimates. ‘squared_hinge’ is like hinge but is quadratically penalized. ‘perceptron’ is the linear loss used by the perceptron algorithm. The other losses are designed for regression but can be useful in classification as well; see SGDRegressor for a description.

penalty

:str, ‘none’, ‘l2’, ‘l1’, or ‘elasticnet’

The penalty (aka regularization term) to be used. Defaults to ‘l2’ which is the standard regularizer for linear SVM models. ‘l1’ and ‘elasticnet’ might bring sparsity to the model (feature selection) not achievable with ‘l2’.

alpha

:float

Constant that multiplies the regularization term. Defaults to 0.0001. Also used to compute learning_rate when set to ‘optimal’.

l1_ratio

:float

The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1. l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1. Defaults to 0.15.

fit_intercept

:bool

Whether the intercept should be estimated or not. If False, the data is assumed to be already centered. Defaults to True.

n_iter

:int, optional

The number of passes over the training data (aka epochs). The number of iterations is set to 1 if using partial_fit. Defaults to 5.

shuffle

:bool, optional

Whether or not the training data should be shuffled after each epoch. Defaults to True.

random_state

:int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use when shuffling the data.

verbose

:integer, optional

The verbosity level.

epsilon

:float

Epsilon in the epsilon-insensitive loss functions; only if loss is ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’. For ‘huber’, determines the threshold at which it becomes less important to get the prediction exactly right. For epsilon-insensitive, any differences between the current prediction and the correct label are ignored if they are less than this threshold.

n_jobs

:integer, optional

The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. -1 means ‘all CPUs’. Defaults to 1.

learning_rate

:string, optional

The learning rate schedule:

constant: eta = eta0
optimal: eta = 1.0 / (alpha * (t + t0)) [default]
invscaling: eta = eta0 / pow(t, power_t)
where t0 is chosen by a heuristic proposed by Leon Bottou.

eta0

:double

The initial learning rate for the ‘constant’ or ‘invscaling’ schedules. The default value is 0.0 as eta0 is not used by the default schedule ‘optimal’.

power_t

:double

The exponent for inverse scaling learning rate [default 0.5].

class_weight

:dict, {class_label: weight} or “balanced” or None, optional

Preset for the class_weight fit parameter.

Weights associated with classes. If not given, all classes are supposed to have weight one.

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

warm_start

:bool, optional

When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.

average

:bool or int, optional

When set to True, computes the averaged SGD weights and stores the result in the coef_ attribute. If set to an int greater than 1, averaging will begin once the total number of samples seen reaches average. So average=10 will begin averaging after seeing 10 samples.
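
The three learning_rate schedules listed above can be written as one small function. The formulas are taken from the parameter description; the default values used here and the fixed t0 are assumptions, since the library chooses t0 by a heuristic.

```python
def learning_rate(schedule, t, eta0=0.01, alpha=0.0001, power_t=0.5, t0=1.0):
    """Step size at iteration t (t >= 1) for the three schedules."""
    if schedule == "constant":
        return eta0
    if schedule == "optimal":
        # t0 is a placeholder value, not the library's heuristic.
        return 1.0 / (alpha * (t + t0))
    if schedule == "invscaling":
        return eta0 / pow(t, power_t)
    raise ValueError("unknown schedule: %r" % schedule)


# With power_t=0.5 the step size halves when t grows by a factor of 4.
steps = [learning_rate("invscaling", t, eta0=0.1) for t in (1, 4, 16)]
```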

Attributes

coef_: Weights assigned to the features.
intercept_: Constants in decision function.

Examples

>>> import numpy as np
>>> from sklearn import linear_model
>>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>> Y = np.array([1, 1, 2, 2])
>>> clf = linear_model.SGDClassifier()
>>> clf.fit(X, Y)
SGDClassifier(alpha=0.0001, average=False, class_weight=None, epsilon=0.1,
        eta0=0.0, fit_intercept=True, l1_ratio=0.15,
        learning_rate='optimal', loss='hinge', n_iter=5, n_jobs=1,
        penalty='l2', power_t=0.5, random_state=None, shuffle=True,
        verbose=0, warm_start=False)
>>> print(clf.predict([[-0.8, -1]]))
[1]

See also

LinearSVC, LogisticRegression, Perceptron

POSSIBLE NODE NAMES:
	SGDClassifierSklearnNode SGDClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SGDRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SGDRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Linear model fitted by minimizing a regularized empirical loss with SGD

This node has been automatically generated by wrapping the sklearn.linear_model.stochastic_gradient.SGDRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

SGD stands for Stochastic Gradient Descent: the gradient of the loss is estimated one sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate).

The regularizer is a penalty added to the loss function that shrinks model parameters towards the zero vector using either the squared euclidean norm L2 or the absolute norm L1 or a combination of both (Elastic Net). If the parameter update crosses the 0.0 value because of the regularizer, the update is truncated to 0.0 to allow for learning sparse models and achieve online feature selection.

This implementation works with data represented as dense numpy arrays of floating point values for the features.

Read more in the User Guide.

Parameters

loss

:str, ‘squared_loss’, ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’

The loss function to be used. Defaults to ‘squared_loss’ which refers to the ordinary least squares fit. ‘huber’ modifies ‘squared_loss’ to focus less on getting outliers correct by switching from squared to linear loss past a distance of epsilon. ‘epsilon_insensitive’ ignores errors less than epsilon and is linear past that; this is the loss function used in SVR. ‘squared_epsilon_insensitive’ is the same but becomes squared loss past a tolerance of epsilon.
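The four loss options can be compared side by side with a short sketch. The function name `sgd_regression_loss` is hypothetical, and the factor 0.5 in the squared and Huber branches is an assumption made for this illustration; the wrapped library may scale its losses differently.

```python
def sgd_regression_loss(loss, prediction, target, epsilon=0.1):
    """Per-sample loss for the options described above (illustrative scaling)."""
    r = abs(prediction - target)  # absolute residual
    if loss == "squared_loss":
        return 0.5 * r * r
    if loss == "huber":
        # Quadratic near zero, linear once the residual exceeds epsilon.
        if r <= epsilon:
            return 0.5 * r * r
        return epsilon * r - 0.5 * epsilon * epsilon
    if loss == "epsilon_insensitive":
        # Residuals inside the epsilon tube cost nothing.
        return max(0.0, r - epsilon)
    if loss == "squared_epsilon_insensitive":
        return max(0.0, r - epsilon) ** 2
    raise ValueError("unknown loss: %r" % (loss,))


for name in ("squared_loss", "huber", "epsilon_insensitive",
             "squared_epsilon_insensitive"):
    print(name, sgd_regression_loss(name, 3.0, 1.0, epsilon=1.0))
```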

penalty

:str, ‘none’, ‘l2’, ‘l1’, or ‘elasticnet’

The penalty (aka regularization term) to be used. Defaults to ‘l2’ which is the standard regularizer for linear SVM models. ‘l1’ and ‘elasticnet’ might bring sparsity to the model (feature selection) not achievable with ‘l2’.

alpha

:float

Constant that multiplies the regularization term. Defaults to 0.0001. Also used to compute learning_rate when set to ‘optimal’.

l1_ratio

:float

The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1. l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1. Defaults to 0.15.
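How alpha and l1_ratio combine can be made concrete with a minimal sketch of an Elastic Net penalty. The helper name `elastic_net_penalty` is hypothetical, and the factor 0.5 on the squared L2 norm is an assumption for this illustration rather than a statement about the wrapped implementation.

```python
def elastic_net_penalty(weights, alpha=0.0001, l1_ratio=0.15):
    """Mix the L1 norm and half the squared L2 norm of the weights."""
    l1 = sum(abs(w) for w in weights)
    l2 = 0.5 * sum(w * w for w in weights)  # assumed scaling of the L2 term
    return alpha * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)


w = [3.0, -4.0]
print(elastic_net_penalty(w, alpha=1.0, l1_ratio=0.0))  # pure L2 term
print(elastic_net_penalty(w, alpha=1.0, l1_ratio=1.0))  # pure L1 term
```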

fit_intercept

:bool

Whether the intercept should be estimated or not. If False, the data is assumed to be already centered. Defaults to True.

n_iter

:int, optional

The number of passes over the training data (aka epochs). The number of iterations is set to 1 if using partial_fit. Defaults to 5.

shuffle

:bool, optional

Whether or not the training data should be shuffled after each epoch. Defaults to True.

random_state

:int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use when shuffling the data.

verbose

:integer, optional

The verbosity level.

epsilon

:float

Epsilon in the epsilon-insensitive loss functions; only if loss is ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’. For ‘huber’, determines the threshold at which it becomes less important to get the prediction exactly right. For epsilon-insensitive, any differences between the current prediction and the correct label are ignored if they are less than this threshold.

learning_rate

:string, optional

The learning rate:

constant: eta = eta0
optimal: eta = 1.0/(alpha * t)
invscaling: eta = eta0 / pow(t, power_t) [default]

eta0

:double, optional

The initial learning rate [default 0.01].

power_t

:double, optional

The exponent for inverse scaling learning rate [default 0.25].

warm_start

:bool, optional

When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.

average

:bool or int, optional

When set to True, computes the averaged SGD weights and stores the result in the coef_ attribute. If set to an int greater than 1, averaging will begin once the total number of samples seen reaches average. So average=10 will begin averaging after seeing 10 samples.

Attributes

coef_: Weights assigned to the features.
intercept_: The intercept term.
average_coef_: Averaged weights assigned to the features.
average_intercept_: The averaged intercept term.

Examples

>>> import numpy as np
>>> from sklearn import linear_model
>>> n_samples, n_features = 10, 5
>>> np.random.seed(0)
>>> y = np.random.randn(n_samples)
>>> X = np.random.randn(n_samples, n_features)
>>> clf = linear_model.SGDRegressor()
>>> clf.fit(X, y)
SGDRegressor(alpha=0.0001, average=False, epsilon=0.1, eta0=0.01,
             fit_intercept=True, l1_ratio=0.15, learning_rate='invscaling',
             loss='squared_loss', n_iter=5, penalty='l2', power_t=0.25,
             random_state=None, shuffle=True, verbose=0, warm_start=False)

See also

Ridge, ElasticNet, Lasso, SVR

POSSIBLE NODE NAMES:
	SGDRegressorSklearn SGDRegressorSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SVCClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SVCClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

C-Support Vector Classification.

This node has been automatically generated by wrapping the sklearn.svm.classes.SVC class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The implementation is based on libsvm. The fit time complexity is more than quadratic in the number of samples, which makes it hard to scale to datasets with more than a couple of 10000 samples.

The multiclass support is handled according to a one-vs-one scheme.

For details on the precise mathematical formulation of the provided kernel functions and how gamma, coef0 and degree affect each other, see the corresponding section in the narrative documentation:

svm_kernels.

Read more in the User Guide.

Parameters

C: Penalty parameter C of the error term.
kernel: Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. If a callable is given it is used to pre-compute the kernel matrix from data matrices; that matrix should be an array of shape (n_samples, n_samples).
degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.
coef0: Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
probability: Whether to enable probability estimates. This must be enabled prior to calling fit, and will slow down that method.
shrinking: Whether to use the shrinking heuristic.
tol: Tolerance for stopping criterion.
cache_size: Specify the size of the kernel cache (in MB).
class_weight: Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))
verbose: Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
max_iter: Hard limit on iterations within solver, or -1 for no limit.
decision_function_shape: Whether to return a one-vs-rest (‘ovr’) decision function of shape (n_samples, n_classes) as all other classifiers, or the original one-vs-one (‘ovo’) decision function of libsvm which has shape (n_samples, n_classes * (n_classes - 1) / 2). The default of None will currently behave as ‘ovo’ for backward compatibility and raise a deprecation warning, but will change to ‘ovr’ in 0.18.

New in version 0.17: decision_function_shape=’ovr’ is recommended.

Changed in version 0.17: Deprecated decision_function_shape=’ovo’ and None.
random_state: The seed of the pseudo random number generator to use when shuffling the data for probability estimation.
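To make the two decision_function_shape layouts concrete, the sketch below enumerates which classes each output column would refer to. The helper name `decision_function_columns` is hypothetical, and the column ordering shown for ‘ovo’ (sorted pairs) is an assumption for illustration.

```python
from itertools import combinations


def decision_function_columns(class_labels, shape="ovr"):
    """List the class (or class pair) behind each decision-function column."""
    labels = sorted(class_labels)
    if shape == "ovr":
        # One column per class.
        return [(label,) for label in labels]
    if shape == "ovo":
        # One column per unordered pair of classes.
        return list(combinations(labels, 2))
    raise ValueError("unknown shape: %r" % (shape,))


labels = [0, 1, 2, 3]
print(len(decision_function_columns(labels, "ovr")))
print(decision_function_columns(labels, "ovo"))
```

For n_classes = 4 this gives 4 columns for ‘ovr’ and 4 * 3 / 2 = 6 columns for ‘ovo’, matching the shapes stated for the parameter.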

Attributes

support_

:array-like, shape = [n_SV]

Indices of support vectors.

support_vectors_

:array-like, shape = [n_SV, n_features]

Support vectors.

n_support_

:array-like, dtype=int32, shape = [n_class]

Number of support vectors for each class.

dual_coef_

:array, shape = [n_class-1, n_SV]

Coefficients of the support vector in the decision function. For multiclass, coefficient for all 1-vs-1 classifiers. The layout of the coefficients in the multiclass case is somewhat non-trivial. See the section about multi-class classification in the SVM section of the User Guide for details.

coef_

:array, shape = [n_class-1, n_features]

Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.

coef_ is a readonly property derived from dual_coef_ and support_vectors_.

intercept_

:array, shape = [n_class * (n_class-1) / 2]

Constants in decision function.

Examples

>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>> y = np.array([1, 1, 2, 2])
>>> from sklearn.svm import SVC
>>> clf = SVC()
>>> clf.fit(X, y) 
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)
>>> print(clf.predict([[-0.8, -1]]))
[1]

See also

SVR: Support Vector Machine for Regression implemented using libsvm.
LinearSVC: Scalable Linear Support Vector Machine for classification implemented using liblinear. Check the See also section of LinearSVC for further comparison.

POSSIBLE NODE NAMES:
	SVCClassifierSklearnNode SVCClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SVRRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SVRRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Epsilon-Support Vector Regression.

This node has been automatically generated by wrapping the sklearn.svm.classes.SVR class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The free parameters in the model are C and epsilon.

The implementation is based on libsvm.

Read more in the User Guide.

Parameters

C: Penalty parameter C of the error term.
epsilon: Epsilon in the epsilon-SVR model. It specifies the epsilon-tube within which no penalty is associated in the training loss function with points predicted within a distance epsilon from the actual value.
kernel: Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. If a callable is given it is used to precompute the kernel matrix.
degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.
coef0: Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
shrinking: Whether to use the shrinking heuristic.
tol: Tolerance for stopping criterion.
cache_size: Specify the size of the kernel cache (in MB).
verbose: Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
max_iter: Hard limit on iterations within solver, or -1 for no limit.
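The interplay of kernel, gamma, degree and coef0 can be illustrated with the textbook forms of the four built-in kernels. This is a sketch under the assumption that the wrapped library follows these standard definitions; the function names `dot` and `kernel_value` are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(x, y))


def kernel_value(x, y, kernel="rbf", gamma="auto", degree=3, coef0=0.0):
    """Evaluate one kernel entry k(x, y) using the standard formulas."""
    if gamma == "auto":
        gamma = 1.0 / len(x)  # 1 / n_features, as described for gamma
    if kernel == "linear":
        return dot(x, y)
    if kernel == "poly":
        return (gamma * dot(x, y) + coef0) ** degree
    if kernel == "rbf":
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-gamma * squared_distance)
    if kernel == "sigmoid":
        return math.tanh(gamma * dot(x, y) + coef0)
    raise ValueError("unknown kernel: %r" % (kernel,))


x, y = [1.0, 2.0], [3.0, 4.0]
for name in ("linear", "poly", "rbf", "sigmoid"):
    print(name, kernel_value(x, y, kernel=name))
```

As the parameter descriptions state, degree only enters the ‘poly’ branch and coef0 only the ‘poly’ and ‘sigmoid’ branches.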

Attributes

support_

:array-like, shape = [n_SV]

Indices of support vectors.

support_vectors_

:array-like, shape = [n_SV, n_features]

Support vectors.

dual_coef_

:array, shape = [1, n_SV]

Coefficients of the support vector in the decision function.

coef_

:array, shape = [1, n_features]

Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.

coef_ is a readonly property derived from dual_coef_ and support_vectors_.

intercept_

:array, shape = [1]

Constants in decision function.

Examples

>>> from sklearn.svm import SVR
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> np.random.seed(0)
>>> y = np.random.randn(n_samples)
>>> X = np.random.randn(n_samples, n_features)
>>> clf = SVR(C=1.0, epsilon=0.2)
>>> clf.fit(X, y) 
SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.2, gamma='auto',
    kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

See also

NuSVR: Support Vector Machine for regression implemented using libsvm using a parameter to control the number of support vectors.
LinearSVR: Scalable Linear Support Vector Machine for regression implemented using liblinear.

POSSIBLE NODE NAMES:
	SVRRegressorSklearnNode SVRRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SelectFdrTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SelectFdrTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Filter: Select the p-values for an estimated false discovery rate

This node has been automatically generated by wrapping the sklearn.feature_selection.univariate_selection.SelectFdr class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This uses the Benjamini-Hochberg procedure. alpha is an upper bound on the expected false discovery rate.
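The Benjamini-Hochberg step-up rule mentioned above can be sketched in a few lines of plain Python. The function name `benjamini_hochberg_mask` is hypothetical, and the sketch only illustrates the procedure on a list of p-values; it does not reproduce the wrapped class.

```python
def benjamini_hochberg_mask(pvalues, alpha=0.05):
    """Return a keep/discard flag per feature using the step-up rule."""
    n = len(pvalues)
    ranked = sorted(pvalues)
    # The i-th smallest p-value passes if it is at most alpha * i / n.
    passing = [p for rank, p in enumerate(ranked, start=1)
               if p <= alpha * rank / n]
    if not passing:
        return [False] * n
    # Keep every feature up to the largest passing p-value.
    threshold = max(passing)
    return [p <= threshold for p in pvalues]


print(benjamini_hochberg_mask([0.039, 0.001, 0.205, 0.008]))
```

Because the rule is step-up, a p-value that fails its own rank test is still kept when a larger p-value passes at a higher rank.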

Read more in the User Guide.

Parameters

score_func: Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues).
alpha: The highest uncorrected p-value for features to keep.

Attributes

scores_: Scores of features.
pvalues_: p-values of feature scores.

References

http://en.wikipedia.org/wiki/False_discovery_rate

See also

f_classif: ANOVA F-value between label/feature for classification tasks.
chi2: Chi-squared stats of non-negative features for classification tasks.
f_regression: F-value between label/feature for regression tasks.
SelectPercentile: Select features based on percentile of the highest scores.
SelectKBest: Select features based on the k highest scores.
SelectFpr: Select features based on a false positive rate test.
SelectFwe: Select features based on family-wise error rate.
GenericUnivariateSelect: Univariate feature selector with configurable mode.

POSSIBLE NODE NAMES:
	SelectFdrTransformerSklearn SelectFdrTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SelectFprTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SelectFprTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Filter: Select the p-values below alpha based on an FPR test.

This node has been automatically generated by wrapping the sklearn.feature_selection.univariate_selection.SelectFpr class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

FPR test stands for False Positive Rate test. It controls the total amount of false detections.

Read more in the User Guide.

Parameters

score_func: Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues).
alpha: The highest p-value for features to be kept.

Attributes

scores_: Scores of features.
pvalues_: p-values of feature scores.

See also

f_classif: ANOVA F-value between label/feature for classification tasks.
chi2: Chi-squared stats of non-negative features for classification tasks.
f_regression: F-value between label/feature for regression tasks.
SelectPercentile: Select features based on percentile of the highest scores.
SelectKBest: Select features based on the k highest scores.
SelectFdr: Select features based on an estimated false discovery rate.
SelectFwe: Select features based on family-wise error rate.
GenericUnivariateSelect: Univariate feature selector with configurable mode.

POSSIBLE NODE NAMES:
	SelectFprTransformerSklearn SelectFprTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SelectFromModelTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SelectFromModelTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Meta-transformer for selecting features based on importance weights.

This node has been automatically generated by wrapping the sklearn.feature_selection.from_model.SelectFromModel class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

New in version 0.17.

Parameters

estimator: The base estimator from which the transformer is built. This can be both a fitted (if prefit is set to True) or a non-fitted estimator.
threshold: The threshold value to use for feature selection. Features whose importance is greater or equal are kept while the others are discarded. If “median” (resp. “mean”), then the threshold value is the median (resp. the mean) of the feature importances. A scaling factor (e.g., “1.25*mean”) may also be used. If None and if the estimator has a parameter penalty set to l1, either explicitly or implicitly (e.g., Lasso), the threshold used is 1e-5. Otherwise, “mean” is used by default.
prefit: Whether a prefit model is expected to be passed into the constructor directly or not. If True, transform must be called directly and SelectFromModel cannot be used with cross_val_score, GridSearchCV and similar utilities that clone the estimator. Otherwise train the model using fit and then transform to do feature selection.
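The string forms accepted by threshold (“mean”, “median”, and scaled variants such as “1.25*mean”) can be illustrated with a small resolver. Both function names are hypothetical, and the parsing shown is an assumption about how such strings could be interpreted, not the wrapped library's parser.

```python
import statistics


def resolve_threshold(importances, threshold="mean"):
    """Turn a numeric or string threshold into a float cutoff."""
    if not isinstance(threshold, str):
        return float(threshold)
    scale, reference = 1.0, threshold.strip()
    if "*" in reference:
        factor, reference = reference.split("*")
        scale, reference = float(factor.strip()), reference.strip()
    if reference == "mean":
        return scale * statistics.mean(importances)
    if reference == "median":
        return scale * statistics.median(importances)
    raise ValueError("unknown threshold: %r" % (threshold,))


def select_features(importances, threshold="mean"):
    """Indices of features whose importance is greater or equal to the cutoff."""
    cutoff = resolve_threshold(importances, threshold)
    return [i for i, value in enumerate(importances) if value >= cutoff]


importances = [1.0, 2.0, 3.0, 10.0]
print(select_features(importances, "median"))
print(select_features(importances, "1.25*mean"))
```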

Attributes

estimator_: an estimator: The base estimator from which the transformer is built. This is stored only when a non-fitted estimator is passed to the SelectFromModel, i.e. when prefit is False.
threshold_: float: The threshold value used for feature selection.

POSSIBLE NODE NAMES:
	SelectFromModelTransformerSklearn SelectFromModelTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SelectFweTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SelectFweTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Filter: Select the p-values corresponding to Family-wise error rate

This node has been automatically generated by wrapping the sklearn.feature_selection.univariate_selection.SelectFwe class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

score_func: Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues).
alpha: The highest uncorrected p-value for features to keep.
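One common way to control the family-wise error rate is a Bonferroni-style correction, where each uncorrected p-value is compared against alpha divided by the number of features. The sketch below assumes that rule for illustration; the function name `family_wise_mask` is hypothetical.

```python
def family_wise_mask(pvalues, alpha=0.05):
    """Keep features whose p-value is below alpha / n_features."""
    cutoff = alpha / len(pvalues)
    return [p < cutoff for p in pvalues]


# With four features and alpha=0.05 the per-feature cutoff is 0.0125.
print(family_wise_mask([0.001, 0.02, 0.2, 0.012]))
```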

Attributes

scores_: Scores of features.
pvalues_: p-values of feature scores.

See also

f_classif: ANOVA F-value between label/feature for classification tasks.
chi2: Chi-squared stats of non-negative features for classification tasks.
f_regression: F-value between label/feature for regression tasks.
SelectPercentile: Select features based on percentile of the highest scores.
SelectKBest: Select features based on the k highest scores.
SelectFpr: Select features based on a false positive rate test.
SelectFdr: Select features based on an estimated false discovery rate.
GenericUnivariateSelect: Univariate feature selector with configurable mode.

POSSIBLE NODE NAMES:
	SelectFweTransformerSklearnNode SelectFweTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SelectKBestTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SelectKBestTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Select features according to the k highest scores.

This node has been automatically generated by wrapping the sklearn.feature_selection.univariate_selection.SelectKBest class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

score_func: Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues).
k: Number of top features to select. The “all” option bypasses selection, for use in a parameter search.
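Selecting the k highest scores, including the “all” bypass, can be sketched as follows. The helper name `k_best_indices` is hypothetical; ties are broken by position in this sketch, whereas the wrapped class leaves tie handling unspecified.

```python
def k_best_indices(scores, k):
    """Return the sorted indices of the k highest-scoring features."""
    if k == "all":
        return list(range(len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


scores = [0.2, 3.1, 1.5, 0.9]
print(k_best_indices(scores, 2))
print(k_best_indices(scores, "all"))
```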

Attributes

scores_: Scores of features.
pvalues_: p-values of feature scores.

Notes

Ties between features with equal scores will be broken in an unspecified way.

See also

f_classif: ANOVA F-value between label/feature for classification tasks.
chi2: Chi-squared stats of non-negative features for classification tasks.
f_regression: F-value between label/feature for regression tasks.
SelectPercentile: Select features based on percentile of the highest scores.
SelectFpr: Select features based on a false positive rate test.
SelectFdr: Select features based on an estimated false discovery rate.
SelectFwe: Select features based on family-wise error rate.
GenericUnivariateSelect: Univariate feature selector with configurable mode.

POSSIBLE NODE NAMES:
	SelectKBestTransformerSklearn SelectKBestTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SelectPercentileTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SelectPercentileTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Select features according to a percentile of the highest scores.

This node has been automatically generated by wrapping the sklearn.feature_selection.univariate_selection.SelectPercentile class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

score_func: Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues).
percentile: Percent of features to keep.

Attributes

scores_: Scores of features.
pvalues_: p-values of feature scores.

Notes

Ties between features with equal scores will be broken in an unspecified way.

See also

f_classif: ANOVA F-value between label/feature for classification tasks.
chi2: Chi-squared stats of non-negative features for classification tasks.
f_regression: F-value between label/feature for regression tasks.
SelectKBest: Select features based on the k highest scores.
SelectFpr: Select features based on a false positive rate test.
SelectFdr: Select features based on an estimated false discovery rate.
SelectFwe: Select features based on family-wise error rate.
GenericUnivariateSelect: Univariate feature selector with configurable mode.

POSSIBLE NODE NAMES:
	SelectPercentileTransformerSklearn SelectPercentileTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SkewedChi2SamplerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SkewedChi2SamplerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Approximates feature map of the “skewed chi-squared” kernel by Monte Carlo approximation of its Fourier transform.

This node has been automatically generated by wrapping the sklearn.kernel_approximation.SkewedChi2Sampler class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Read more in the User Guide.

Parameters

skewedness: “skewedness” parameter of the kernel. Needs to be cross-validated.
n_components: number of Monte Carlo samples per original feature. Equals the dimensionality of the computed feature space.
random_state: If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator.

References

See “Random Fourier Approximations for Skewed Multiplicative Histogram Kernels” by Fuxin Li, Catalin Ionescu and Cristian Sminchisescu.

See also

AdditiveChi2Sampler: variant of the chi squared kernel.

sklearn.metrics.pairwise.chi2_kernel : The exact chi squared kernel.

POSSIBLE NODE NAMES:
	SkewedChi2SamplerTransformerSklearnNode SkewedChi2SamplerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SparseCoderTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SparseCoderTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Sparse coding

This node has been automatically generated by wrapping the sklearn.decomposition.dict_learning.SparseCoder class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Finds a sparse representation of data against a fixed, precomputed dictionary.

Each row of the result is the solution to a sparse coding problem. The goal is to find a sparse array code such that:

X ~= code * dictionary

Read more in the User Guide.

Parameters

dictionary

:array, [n_components, n_features]

The dictionary atoms used for sparse coding. Lines are assumed to be normalized to unit norm.

transform_algorithm

:{‘lasso_lars’, ‘lasso_cd’, ‘lars’, ‘omp’, ‘threshold’}

Algorithm used to transform the data:

lars: uses the least angle regression method (linear_model.lars_path)
lasso_lars: uses Lars to compute the Lasso solution
lasso_cd: uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso). lasso_lars will be faster if the estimated components are sparse.
omp: uses orthogonal matching pursuit to estimate the sparse solution
threshold: squashes to zero all coefficients less than alpha from the projection dictionary * X'
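Of the algorithms listed above, ‘threshold’ is the simplest to illustrate: project the sample onto each dictionary atom and zero out small coefficients. The sketch follows the description given here literally; the function names are hypothetical, and the wrapped implementation may differ in detail (for instance by also shrinking the surviving coefficients).

```python
def threshold_code(sample, dictionary, alpha=1.0):
    """Project a sample on each atom and zero coefficients below alpha."""
    projection = [sum(a * x for a, x in zip(atom, sample))
                  for atom in dictionary]
    return [c if abs(c) >= alpha else 0.0 for c in projection]


def reconstruct(code, dictionary):
    """Approximate the sample as code * dictionary."""
    n_features = len(dictionary[0])
    return [sum(c * atom[j] for c, atom in zip(code, dictionary))
            for j in range(n_features)]


# Two unit-norm atoms in a three-dimensional feature space.
dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
code = threshold_code([2.0, 0.3, 0.0], dictionary, alpha=1.0)
print(code, reconstruct(code, dictionary))
```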

transform_n_nonzero_coefs

:int, 0.1 * n_features by default

Number of nonzero coefficients to target in each column of the solution. This is only used by algorithm='lars' and algorithm='omp' and is overridden by alpha in the omp case.

transform_alpha

:float, 1. by default

If algorithm='lasso_lars' or algorithm='lasso_cd', alpha is the penalty applied to the L1 norm. If algorithm='threshold', alpha is the absolute value of the threshold below which coefficients will be squashed to zero. If algorithm='omp', alpha is the tolerance parameter: the value of the reconstruction error targeted. In this case, it overrides n_nonzero_coefs.

split_sign

:bool, False by default

Whether to split the sparse feature vector into the concatenation of its negative part and its positive part. This can improve the performance of downstream classifiers.

n_jobs

:int

Number of parallel jobs to run.

Attributes

components_: The unchanged dictionary atoms

See also

DictionaryLearning MiniBatchDictionaryLearning SparsePCA MiniBatchSparsePCA sparse_encode

POSSIBLE NODE NAMES:
	SparseCoderTransformerSklearnNode SparseCoderTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SparsePCATransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SparsePCATransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Sparse Principal Components Analysis (SparsePCA)

This node has been automatically generated by wrapping the sklearn.decomposition.sparse_pca.SparsePCA class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Finds the set of sparse components that can optimally reconstruct the data. The amount of sparseness is controllable by the coefficient of the L1 penalty, given by the parameter alpha.

Read more in the User Guide.

Parameters

n_components: Number of sparse atoms to extract.
alpha: Sparsity controlling parameter. Higher values lead to sparser components.
ridge_alpha: Amount of ridge shrinkage to apply in order to improve conditioning when calling the transform method.
max_iter: Maximum number of iterations to perform.
tol: Tolerance for the stopping condition.
method: One of ‘lars’ or ‘cd’. lars: uses the least angle regression method to solve the lasso problem (linear_model.lars_path). cd: uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso). lars will be faster if the estimated components are sparse.
n_jobs: Number of parallel jobs to run.
U_init: Initial values for the loadings for warm restart scenarios.
V_init: Initial values for the components for warm restart scenarios.

verbose: Degree of verbosity of the printed output.
random_state: Pseudo-random number generator state used for random sampling.

Attributes

components_: Sparse components extracted from the data.
error_: Vector of errors at each iteration.
n_iter_: Number of iterations run.

See also

PCA MiniBatchSparsePCA DictionaryLearning

POSSIBLE NODE NAMES:
	SparsePCATransformerSklearnNode SparsePCATransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.SparseRandomProjectionTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.SparseRandomProjectionTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Reduce dimensionality through sparse random projection

This node has been automatically generated by wrapping the sklearn.random_projection.SparseRandomProjection class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Sparse random matrix is an alternative to dense random projection matrix that guarantees similar embedding quality while being much more memory efficient and allowing faster computation of the projected data.

If we define s = 1 / density, the components of the random matrix are drawn from:

-sqrt(s) / sqrt(n_components) with probability 1 / (2s)

0 with probability 1 - 1 / s

+sqrt(s) / sqrt(n_components) with probability 1 / (2s)
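The three-valued distribution above can be sampled directly with the standard library. This is a minimal sketch of the stated probabilities, not the wrapped implementation; the function name `sparse_random_matrix` and the `seed` argument are hypothetical.

```python
import math
import random


def sparse_random_matrix(n_components, n_features, density, seed=0):
    """Draw a projection matrix entry by entry from the distribution above."""
    rng = random.Random(seed)
    s = 1.0 / density
    value = math.sqrt(s) / math.sqrt(n_components)
    matrix = []
    for _ in range(n_components):
        row = []
        for _ in range(n_features):
            u = rng.random()
            if u < 1.0 / (2.0 * s):      # probability 1 / (2s)
                row.append(-value)
            elif u < 1.0 / s:            # another 1 / (2s)
                row.append(value)
            else:                        # remaining 1 - 1 / s
                row.append(0.0)
        matrix.append(row)
    return matrix


matrix = sparse_random_matrix(4, 50, density=1.0 / 3.0, seed=1)
nonzero = sum(1 for row in matrix for x in row if x != 0.0)
print(nonzero, "of", 4 * 50, "entries are non-zero")
```

With density = 1 every entry is non-zero and the matrix reduces to a dense sign matrix scaled by 1 / sqrt(n_components).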

Read more in the User Guide.

Parameters

n_components

:int or ‘auto’, optional (default = ‘auto’)

Dimensionality of the target projection space.

n_components can be automatically adjusted according to the number of samples in the dataset and the bound given by the Johnson-Lindenstrauss lemma. In that case the quality of the embedding is controlled by the eps parameter.

It should be noted that the Johnson-Lindenstrauss lemma can yield very conservative estimates of the required number of components as it makes no assumption on the structure of the dataset.

density

:float in range ]0, 1], optional (default=’auto’)

Ratio of non-zero components in the random projection matrix.

If density = ‘auto’, the value is set to the minimum density as recommended by Ping Li et al.: 1 / sqrt(n_features).

Use density = 1 / 3.0 if you want to reproduce the results from Achlioptas, 2001.

eps

:strictly positive float, optional, (default=0.1)

Parameter to control the quality of the embedding according to the Johnson-Lindenstrauss lemma when n_components is set to ‘auto’.

Smaller values lead to better embedding and higher number of dimensions (n_components) in the target projection space.
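The trade-off between eps and the number of dimensions can be made concrete with the commonly cited Johnson-Lindenstrauss bound, n_components >= 4 * ln(n_samples) / (eps^2 / 2 - eps^3 / 3). The sketch below assumes this form of the bound and truncates the result to an integer; the function is written here for illustration only.

```python
import math


def johnson_lindenstrauss_min_dim(n_samples, eps=0.1):
    """Minimum number of components suggested by the JL bound (assumed form)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    denominator = (eps ** 2 / 2.0) - (eps ** 3 / 3.0)
    return int(4.0 * math.log(n_samples) / denominator)


for eps in (0.5, 0.1):
    print(eps, johnson_lindenstrauss_min_dim(1000000, eps))
```

The bound depends only on the number of samples and eps, not on the number of input features, which is why it can be very conservative for structured data.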

dense_output

:boolean, optional (default=False)

If True, ensure that the output of the random projection is a dense numpy array even if the input and random projection matrix are both sparse. In practice, if the number of components is small the number of zero components in the projected data will be very small and it will be more CPU and memory efficient to use a dense representation.

If False, the projected data uses a sparse representation if the input is sparse.

random_state

:integer, RandomState instance or None (default=None)

Control the pseudo random number generator used to generate the matrix at fit time.
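
The interplay between eps and n_components = ‘auto’ described above can be sketched numerically. This is a minimal sketch, assuming the commonly used form of the Johnson-Lindenstrauss bound, n_components >= 4 ln(n_samples) / (eps^2 / 2 - eps^3 / 3); the helper name jl_min_dim is hypothetical and not part of pySPACE.

```python
import math


def jl_min_dim(n_samples, eps=0.1):
    """Smallest target dimension allowed by the assumed JL bound."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    denominator = eps ** 2 / 2.0 - eps ** 3 / 3.0
    return int(4.0 * math.log(n_samples) / denominator)


# Smaller eps (tighter distortion) requires more target dimensions.
dims = [jl_min_dim(10 ** 6, eps) for eps in (0.5, 0.1)]
```

The bound depends only on the number of samples and eps, not on the number of features, which is why it can be very conservative for structured data.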

Attributes

n_components_: Concrete number of components computed when n_components = ‘auto’.
components_: Random matrix used for the projection.
density_: Concrete density computed when density = ‘auto’.

See Also

GaussianRandomProjection

References

[1]	Ping Li, T. Hastie and K. W. Church, 2006, “Very Sparse Random Projections”. http://www.stanford.edu/~hastie/Papers/Ping/KDD06_rp.pdf

[2]	D. Achlioptas, 2001, “Database-friendly random projections”, http://www.cs.ucsc.edu/~optas/papers/jl.pdf

POSSIBLE NODE NAMES:
	SparseRandomProjectionTransformerSklearn SparseRandomProjectionTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.StandardScalerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.StandardScalerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Standardize features by removing the mean and scaling to unit variance

This node has been automatically generated by wrapping the sklearn.preprocessing.data.StandardScaler class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. Mean and standard deviation are then stored to be used on later data using the transform method.
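
The fit-then-transform behaviour described above can be sketched in plain Python. This is an illustration only: the helper names fit_standardizer and apply_standardizer are hypothetical, the variance is assumed to be the population variance (division by the number of samples), and features with zero variance are assumed to be left unscaled.

```python
import math


def fit_standardizer(samples):
    """Compute per-feature mean and scale from the training samples."""
    n = float(len(samples))
    n_features = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(n_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in samples) / n
        for j in range(n_features)
    ]
    # Assumption: constant features get a scale of 1 to avoid division by zero.
    scales = [math.sqrt(v) if v > 0 else 1.0 for v in variances]
    return means, scales


def apply_standardizer(samples, means, scales):
    """Reuse the stored training statistics on (possibly new) data."""
    return [
        [(x - m) / s for x, m, s in zip(row, means, scales)]
        for row in samples
    ]


train = [[0.0, 10.0], [2.0, 30.0], [4.0, 50.0]]
means, scales = fit_standardizer(train)
scaled_train = apply_standardizer(train, means, scales)
scaled_new = apply_standardizer([[2.0, 30.0]], means, scales)
```

Note that the new sample is scaled with the statistics of the training set, not with its own.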

Standardization of a dataset is a common requirement for many machine learning estimators: they might behave badly if the individual features do not more or less look like standard normally distributed data (e.g. Gaussian with 0 mean and unit variance).

For instance, many elements used in the objective function of a learning algorithm (such as the RBF kernel of Support Vector Machines or the L1 and L2 regularizers of linear models) assume that all features are centered around 0 and have variance in the same order. If a feature has a variance that is orders of magnitude larger than others, it might dominate the objective function and make the estimator unable to learn from other features correctly as expected.

This scaler can also be applied to sparse CSR or CSC matrices by passing with_mean=False to avoid breaking the sparsity structure of the data.

Read more in the User Guide.

Parameters

with_mean: If True, center the data before scaling. This does not work (and will raise an exception) when attempted on sparse matrices, because centering them entails building a dense matrix which in common use cases is likely to be too large to fit in memory.
with_std: If True, scale the data to unit variance (or equivalently, unit standard deviation).
copy: If False, try to avoid a copy and do inplace scaling instead. This is not guaranteed to always work inplace; e.g. if the data is not a NumPy array or scipy.sparse CSR matrix, a copy may still be returned.

Attributes

scale_: Per feature relative scaling of the data.

New in version 0.17: scale_ is recommended instead of deprecated std_.
mean_: The mean value for each feature in the training set.
var_: The variance for each feature in the training set. Used to compute scale_
n_samples_seen_: The number of samples processed by the estimator. Will be reset on new calls to fit, but increments across partial_fit calls.

See also

sklearn.preprocessing.scale() to perform centering and scaling without using the Transformer object oriented API

sklearn.decomposition.RandomizedPCA with whiten=True to further remove the linear correlation across features.

POSSIBLE NODE NAMES:
	StandardScalerTransformerSklearn StandardScalerTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.TfidfTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.TfidfTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Transform a count matrix to a normalized tf or tf-idf representation

This node has been automatically generated by wrapping the sklearn.feature_extraction.text.TfidfTransformer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Tf means term-frequency while tf-idf means term-frequency times inverse document-frequency. This is a common term weighting scheme in information retrieval, that has also found good use in document classification.

The goal of using tf-idf instead of the raw frequencies of occurrence of a token in a given document is to scale down the impact of tokens that occur very frequently in a given corpus and that are hence empirically less informative than features that occur in a small fraction of the training corpus.

The actual formula used for tf-idf is tf * (idf + 1) = tf + tf * idf, instead of tf * idf. The effect of this is that terms with zero idf, i.e. that occur in all documents of a training set, will not be entirely ignored. The formulas used to compute tf and idf depend on parameter settings that correspond to the SMART notation used in IR, as follows:

Tf is “n” (natural) by default, “l” (logarithmic) when sublinear_tf=True. Idf is “t” when use_idf is given, “n” (none) otherwise. Normalization is “c” (cosine) when norm=’l2’, “n” (none) when norm=None.
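
The weighting tf * (idf + 1) can be traced by hand with a small sketch. It is not the wrapped implementation: the function name tfidf_sketch is hypothetical, the natural logarithm is assumed for idf, and only the l2 normalization is shown.

```python
import math


def tfidf_sketch(counts, smooth_idf=True, sublinear_tf=False):
    """Apply tf * (idf + 1) followed by l2 normalization to a count matrix."""
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for doc in counts if doc[j] > 0) for j in range(n_terms)]
    # Smoothing adds one to every document frequency (and to n_docs);
    # without it, a term with df == 0 would cause a division by zero.
    add = 1.0 if smooth_idf else 0.0
    idf = [math.log((n_docs + add) / (df[j] + add)) for j in range(n_terms)]
    rows = []
    for doc in counts:
        weights = []
        for j, c in enumerate(doc):
            tf = 1.0 + math.log(c) if (sublinear_tf and c > 0) else float(c)
            weights.append(tf * (idf[j] + 1.0))
        norm = math.sqrt(sum(w * w for w in weights))
        rows.append([w / norm if norm > 0 else 0.0 for w in weights])
    return rows


# Term 0 occurs in every document, term 1 in only one.
counts = [[3, 1], [2, 0], [4, 0]]
rows = tfidf_sketch(counts)
```

Term 0 has zero idf here, yet it keeps a non-zero weight because of the "+ 1", which is the effect described above.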

Read more in the User Guide.

Parameters

norm: Norm used to normalize term vectors. None for no normalization.
use_idf: Enable inverse-document-frequency reweighting.
smooth_idf: Smooth idf weights by adding one to document frequencies, as if an extra document was seen containing every term in the collection exactly once. Prevents zero divisions.
sublinear_tf: Apply sublinear tf scaling, i.e. replace tf with 1 + log(tf).

References

[Yates2011]

R. Baeza-Yates and B. Ribeiro-Neto (2011). Modern Information Retrieval. Addison Wesley, pp. 68-74.

[MRS2008]

C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to Information Retrieval. Cambridge University Press, pp. 118-120.

POSSIBLE NODE NAMES:
	TfidfTransformerSklearnNode TfidfTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.TfidfVectorizerTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.TfidfVectorizerTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Convert a collection of raw documents to a matrix of TF-IDF features.

This node has been automatically generated by wrapping the sklearn.feature_extraction.text.TfidfVectorizer class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

Equivalent to CountVectorizer followed by TfidfTransformer.

Read more in the User Guide.

Parameters

input

:string {‘filename’, ‘file’, ‘content’}

If ‘filename’, the sequence passed as an argument to fit is expected to be a list of filenames that need reading to fetch the raw content to analyze.

If ‘file’, the sequence items must have a ‘read’ method (file-like object) that is called to fetch the bytes in memory.

Otherwise the input is expected to be a sequence of items of type string or bytes, which are analyzed directly.

encoding

:string, ‘utf-8’ by default.

If bytes or files are given to analyze, this encoding is used to decode.

decode_error

:{‘strict’, ‘ignore’, ‘replace’}

Instruction on what to do if a byte sequence is given to analyze that contains characters not of the given encoding. By default, it is ‘strict’, meaning that a UnicodeDecodeError will be raised. Other values are ‘ignore’ and ‘replace’.

strip_accents

:{‘ascii’, ‘unicode’, None}

Remove accents during the preprocessing step. ‘ascii’ is a fast method that only works on characters that have a direct ASCII mapping. ‘unicode’ is a slightly slower method that works on any characters. None (default) does nothing.

analyzer

:string, {‘word’, ‘char’} or callable

Whether the feature should be made of word or character n-grams.

If a callable is passed it is used to extract the sequence of features out of the raw, unprocessed input.

preprocessor

:callable or None (default)

Override the preprocessing (string transformation) stage while preserving the tokenizing and n-grams generation steps.

tokenizer

:callable or None (default)

Override the string tokenization step while preserving the preprocessing and n-grams generation steps. Only applies if analyzer == 'word'.

ngram_range

:tuple (min_n, max_n)

The lower and upper boundary of the range of n-values for different n-grams to be extracted. All values of n such that min_n <= n <= max_n will be used.

stop_words

:string {‘english’}, list, or None (default)

If a string, it is passed to _check_stop_list and the appropriate stop list is returned. ‘english’ is currently the only supported string value.

If a list, that list is assumed to contain stop words, all of which will be removed from the resulting tokens. Only applies if analyzer == 'word'.

If None, no stop words will be used. max_df can be set to a value in the range [0.7, 1.0) to automatically detect and filter stop words based on intra corpus document frequency of terms.

lowercase

:boolean, default True

Convert all characters to lowercase before tokenizing.

token_pattern

:string

Regular expression denoting what constitutes a “token”, only used if analyzer == 'word'. The default regexp selects tokens of 2 or more alphanumeric characters (punctuation is completely ignored and always treated as a token separator).

max_df

:float in range [0.0, 1.0] or int, default=1.0

When building the vocabulary, ignore terms that have a document frequency strictly higher than the given threshold (corpus-specific stop words). If a float, the parameter represents a proportion of documents; if an integer, absolute counts. This parameter is ignored if vocabulary is not None.

min_df

:float in range [0.0, 1.0] or int, default=1

When building the vocabulary, ignore terms that have a document frequency strictly lower than the given threshold. This value is also called cut-off in the literature. If a float, the parameter represents a proportion of documents; if an integer, absolute counts. This parameter is ignored if vocabulary is not None.

max_features

:int or None, default=None

If not None, build a vocabulary that considers only the top max_features terms, ordered by term frequency across the corpus.

This parameter is ignored if vocabulary is not None.

vocabulary

:Mapping or iterable, optional

Either a Mapping (e.g., a dict) where keys are terms and values are indices in the feature matrix, or an iterable over terms. If not given, a vocabulary is determined from the input documents.

binary

:boolean, default=False

If True, all non-zero term counts are set to 1. This does not mean outputs will have only 0/1 values, only that the tf term in tf-idf is binary. (Set idf and normalization to False to get 0/1 outputs.)

dtype

:type, optional

Type of the matrix returned by fit_transform() or transform().

norm

:‘l1’, ‘l2’ or None, optional

Norm used to normalize term vectors. None for no normalization.

use_idf

:boolean, default=True

Enable inverse-document-frequency reweighting.

smooth_idf

:boolean, default=True

Smooth idf weights by adding one to document frequencies, as if an extra document was seen containing every term in the collection exactly once. Prevents zero divisions.

sublinear_tf

:boolean, default=False

Apply sublinear tf scaling, i.e. replace tf with 1 + log(tf).

Attributes

idf_

:array, shape = [n_features], or None

The learned idf vector (global term weights) when use_idf is set to True, None otherwise.

stop_words_

:set

Terms that were ignored because they either:

occurred in too many documents (max_df)

occurred in too few documents (min_df)

were cut off by feature selection (max_features).

This is only available if no vocabulary was given.
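
How min_df and max_df decide which terms end up in stop_words_ can be sketched as follows. This is a simplified illustration: the helper prune_vocabulary is hypothetical, the documents are assumed to be tokenized already, and the max_features cut is left out.

```python
def prune_vocabulary(tokenized_docs, min_df=1, max_df=1.0):
    """Split terms into kept and ignored sets by document frequency."""
    n_docs = len(tokenized_docs)
    df = {}
    for doc in tokenized_docs:
        for term in set(doc):          # count each term once per document
            df[term] = df.get(term, 0) + 1

    def as_count(threshold):
        # Floats are proportions of documents, integers are absolute counts.
        return threshold * n_docs if isinstance(threshold, float) else threshold

    low, high = as_count(min_df), as_count(max_df)
    # Terms strictly below low or strictly above high are ignored.
    kept = set(t for t, c in df.items() if low <= c <= high)
    ignored = set(df) - kept
    return kept, ignored


docs = [["the", "cat"], ["the", "dog"], ["the", "cat", "sat"]]
kept, ignored = prune_vocabulary(docs, min_df=2, max_df=0.9)
```

Here "the" is dropped for occurring in too many documents, while "dog" and "sat" are dropped for occurring in too few.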

See also

CountVectorizer: Tokenize the documents, count the occurrences of tokens and return them as a sparse matrix.
TfidfTransformer: Apply Term Frequency Inverse Document Frequency normalization to a sparse matrix of occurrence counts.

Notes

The stop_words_ attribute can get large and increase the model size when pickling. This attribute is provided only for introspection and can be safely removed using delattr or set to None before pickling.

POSSIBLE NODE NAMES:
	TfidfVectorizerTransformerSklearnNode TfidfVectorizerTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.TheilSenRegressorSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.TheilSenRegressorSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Theil-Sen Estimator: robust multivariate regression model.

This node has been automatically generated by wrapping the sklearn.linear_model.theil_sen.TheilSenRegressor class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

The algorithm calculates least-squares solutions on subsets of size n_subsamples of the samples in X. Any value of n_subsamples between the number of features and the number of samples leads to an estimator that trades off robustness against efficiency. Since the number of least-squares solutions is “n_samples choose n_subsamples”, it can be extremely large and can therefore be limited with max_subpopulation. If this limit is reached, the subsets are chosen randomly. In a final step, the spatial median (or L1 median) of all least-squares solutions is calculated.
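
The idea is easiest to see in the classical one-feature special case, where each subset is a pair of points. The sketch below is a simplification, not the wrapped estimator: it uses the ordinary median of the pairwise slopes in place of the spatial median, and the function name theil_sen_1d is hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen_1d(x, y):
    """Univariate Theil-Sen fit: median slope over all pairs of points."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[i] != x[j]                 # skip pairs with undefined slope
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Four points on y = 2x + 1 plus one gross outlier.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 100.0]
slope, intercept = theil_sen_1d(x, y)
```

The outlier affects only four of the ten pairwise slopes, so the median still recovers the line through the remaining points.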

Read more in the User Guide.

Parameters

fit_intercept: Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations.
copy_X: If True, X will be copied; else, it may be overwritten.
max_subpopulation: Instead of computing with a set of cardinality ‘n choose k’, where n is the number of samples and k is the number of subsamples (at least number of features), consider only a stochastic subpopulation of a given maximal size if ‘n choose k’ is larger than max_subpopulation. For other than small problem sizes this parameter will determine memory usage and runtime if n_subsamples is not changed.
n_subsamples: Number of samples to calculate the parameters. This is at least the number of features (plus 1 if fit_intercept=True) and the number of samples as a maximum. A lower number leads to a higher breakdown point and a low efficiency while a high number leads to a low breakdown point and a high efficiency. If None, take the minimum number of subsamples leading to maximal robustness. If n_subsamples is set to n_samples, Theil-Sen is identical to least squares.
max_iter: Maximum number of iterations for the calculation of spatial median.
tol: Tolerance when calculating spatial median.
random_state: A random number generator instance to define the state of the random permutations generator.
n_jobs: Number of CPUs to use during the cross validation. If -1, use all the CPUs.
verbose: Verbose mode when fitting the model.

Attributes

coef_: Coefficients of the regression model (median of distribution).
intercept_: Estimated intercept of regression model.
breakdown_: Approximated breakdown point.
n_iter_: Number of iterations needed for the spatial median.
n_subpopulation_: Number of combinations taken into account from ‘n choose k’, where n is the number of samples and k is the number of subsamples.

References

Theil-Sen Estimators in a Multiple Linear Regression Model, 2009 Xin Dang, Hanxiang Peng, Xueqin Wang and Heping Zhang http://www.math.iupui.edu/~hpeng/MTSE_0908.pdf

POSSIBLE NODE NAMES:
	TheilSenRegressorSklearnNode TheilSenRegressorSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.TruncatedSVDTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.TruncatedSVDTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Dimensionality reduction using truncated SVD (aka LSA).

This node has been automatically generated by wrapping the sklearn.decomposition.truncated_svd.TruncatedSVD class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This transformer performs linear dimensionality reduction by means of truncated singular value decomposition (SVD). It is very similar to PCA, but operates on sample vectors directly, instead of on a covariance matrix. This means it can work with scipy.sparse matrices efficiently.

In particular, truncated SVD works on term count/tf-idf matrices as returned by the vectorizers in sklearn.feature_extraction.text. In that context, it is known as latent semantic analysis (LSA).

This estimator supports two algorithms: a fast randomized SVD solver, and a “naive” algorithm that uses ARPACK as an eigensolver on (X * X.T) or (X.T * X), whichever is more efficient.
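
The eigensolver idea behind the “naive” algorithm can be illustrated with plain power iteration on X.T * X. This is a much simpler technique than ARPACK or the randomized solver and recovers only the leading component; the helper name top_right_singular_vector is hypothetical.

```python
import math


def top_right_singular_vector(X, n_iter=200):
    """Leading singular value and right singular vector via power iteration."""
    n_features = len(X[0])
    # Assumption: the start vector is not orthogonal to the leading component.
    v = [1.0 / math.sqrt(n_features)] * n_features
    for _ in range(n_iter):
        Xv = [sum(a * b for a, b in zip(row, v)) for row in X]
        # w = X.T * (X * v), computed without forming X.T * X explicitly.
        w = [
            sum(X[i][j] * Xv[i] for i in range(len(X)))
            for j in range(n_features)
        ]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    Xv = [sum(a * b for a, b in zip(row, v)) for row in X]
    sigma = math.sqrt(sum(c * c for c in Xv))
    return sigma, v


X = [[3.0, 0.0], [0.0, 1.0]]
sigma, v = top_right_singular_vector(X)
# Projecting the samples onto v yields the one-dimensional reduced data.
reduced = [sum(a * b for a, b in zip(row, v)) for row in X]
```

Because only products with X and its transpose are needed, the same scheme works on sparse inputs without densifying them.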

Read more in the User Guide.

Parameters

n_components: Desired dimensionality of output data. Must be strictly less than the number of features. The default value is useful for visualisation. For LSA, a value of 100 is recommended.
algorithm: SVD solver to use. Either “arpack” for the ARPACK wrapper in SciPy (scipy.sparse.linalg.svds), or “randomized” for the randomized algorithm due to Halko (2009).
n_iter: Number of iterations for randomized SVD solver. Not used by ARPACK.
random_state: (Seed for) pseudo-random number generator. If not given, the numpy.random singleton is used.
tol: Tolerance for ARPACK. 0 means machine precision. Ignored by randomized SVD solver.

Attributes

components_ : array, shape (n_components, n_features)

explained_variance_ratio_: Percentage of variance explained by each of the selected components.
explained_variance_: The variance of the training samples transformed by a projection to each component.

Examples

>>> from sklearn.decomposition import TruncatedSVD
>>> from sklearn.random_projection import sparse_random_matrix
>>> X = sparse_random_matrix(100, 100, density=0.01, random_state=42)
>>> svd = TruncatedSVD(n_components=5, random_state=42)
>>> svd.fit(X) 
TruncatedSVD(algorithm='randomized', n_components=5, n_iter=5,
        random_state=42, tol=0.0)
>>> print(svd.explained_variance_ratio_) 
[ 0.0782... 0.0552... 0.0544... 0.0499... 0.0413...]
>>> print(svd.explained_variance_ratio_.sum()) 
0.279...

See also

PCA RandomizedPCA

References

Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions, Halko, et al., 2009 (arXiv:0909.4061) http://arxiv.org/pdf/0909.4061

Notes

SVD suffers from a problem called “sign indeterminacy”, which means the sign of the components_ and the output from transform depend on the algorithm and random state. To work around this, fit instances of this class to data once, then keep the instance around to do transformations.

POSSIBLE NODE NAMES:
	TruncatedSVDTransformerSklearnNode TruncatedSVDTransformerSklearn
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.VarianceThresholdTransformerSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.VarianceThresholdTransformerSklearnNode(input_dim=None, output_dim=None, dtype=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Feature selector that removes all low-variance features.

This node has been automatically generated by wrapping the sklearn.feature_selection.variance_threshold.VarianceThreshold class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

This feature selection algorithm looks only at the features (X), not the desired outputs (y), and can thus be used for unsupervised learning.

Read more in the User Guide.

Parameters

threshold: Features with a training-set variance lower than this threshold will be removed. The default is to keep all features with non-zero variance, i.e. remove the features that have the same value in all samples.

Attributes

variances_: Variances of individual features.

Examples

The following dataset has integer features, two of which are the same in every sample. These are removed with the default setting for threshold:

>>> from sklearn.feature_selection import VarianceThreshold
>>> X = [[0, 2, 0, 3], [0, 1, 4, 3], [0, 1, 1, 3]]
>>> selector = VarianceThreshold()
>>> selector.fit_transform(X)
array([[2, 0],
       [1, 4],
       [1, 1]])
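
What the selector computes in this example can be reproduced in a few lines of plain Python. This is an illustrative sketch with a hypothetical helper name; it assumes the population variance and keeps a feature only if its variance is strictly greater than the threshold, which matches the default behaviour described above.

```python
def variance_threshold_sketch(X, threshold=0.0):
    """Drop columns whose variance does not exceed the threshold."""
    n = float(len(X))
    variances = []
    for col in zip(*X):                 # iterate over feature columns
        mean = sum(col) / n
        variances.append(sum((v - mean) ** 2 for v in col) / n)
    keep = [j for j, var in enumerate(variances) if var > threshold]
    return [[row[j] for j in keep] for row in X], variances


X = [[0, 2, 0, 3], [0, 1, 4, 3], [0, 1, 1, 3]]
selected, variances = variance_threshold_sketch(X)
```

The first and last columns are constant, so their variance is zero and they are removed.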

POSSIBLE NODE NAMES:
	VarianceThresholdTransformerSklearn VarianceThresholdTransformerSklearnNode
POSSIBLE INPUT TYPES:
	FeatureVector

`pySPACE.missions.nodes.scikit_nodes.VotingClassifierSklearnNode`¶

class pySPACE.missions.nodes.scikit_nodes.VotingClassifierSklearnNode(input_dim=None, output_dim=None, dtype=None, class_labels=None, **kwargs)¶

Bases: pySPACE.missions.nodes.base_node.BaseNode

Soft Voting/Majority Rule classifier for unfitted estimators.

This node has been automatically generated by wrapping the sklearn.ensemble.voting_classifier.VotingClassifier class from the sklearn library. The wrapped instance can be accessed through the scikit_alg attribute.

New in version 0.17.

Read more in the User Guide.

Parameters

estimators: Invoking the fit method on the VotingClassifier will fit clones of those original estimators that will be stored in the class attribute self.estimators_.
voting: If ‘hard’, uses predicted class labels for majority rule voting. Else if ‘soft’, predicts the class label based on the argmax of the sums of the predicted probabilities, which is recommended for an ensemble of well-calibrated classifiers.
weights: Sequence of weights (float or int) to weight the occurrences of predicted class labels (hard voting) or class probabilities before averaging (soft voting). Uses uniform weights if None.
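
The difference between the two voting modes can be sketched for a single sample. The helpers hard_vote and soft_vote are hypothetical and only mirror the description above; ties are assumed to be broken in favour of the smallest label.

```python
def hard_vote(predictions, weights=None):
    """Weighted majority vote over the labels predicted by each estimator."""
    weights = weights or [1.0] * len(predictions)
    tally = {}
    for label, w in zip(predictions, weights):
        tally[label] = tally.get(label, 0.0) + w
    # Iterating over sorted labels makes ties resolve to the smallest label.
    return max(sorted(tally), key=tally.get)


def soft_vote(probabilities, classes, weights=None):
    """Argmax of the weighted average of predicted class probabilities."""
    weights = weights or [1.0] * len(probabilities)
    total = float(sum(weights))
    averaged = [
        sum(w * p[k] for p, w in zip(probabilities, weights)) / total
        for k in range(len(classes))
    ]
    return classes[averaged.index(max(averaged))]


classes = [1, 2]
# One confident estimator for class 1, two lukewarm estimators for class 2.
probabilities = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
labels = [classes[p.index(max(p))] for p in probabilities]

hard_result = hard_vote(labels)
soft_result = soft_vote(probabilities, classes)
weighted_hard_result = hard_vote(labels, weights=[3, 1, 1])
```

Hard voting follows the two lukewarm estimators, whereas soft voting and the weighted hard vote side with the confident one.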

Attributes

classes_ : array-like, shape = [n_predictions]

Examples

>>> import numpy as np
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.naive_bayes import GaussianNB
>>> from sklearn.ensemble import RandomForestClassifier, VotingClassifier
>>> clf1 = LogisticRegression(random_state=1)
>>> clf2 = RandomForestClassifier(random_state=1)
>>> clf3 = GaussianNB()
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> eclf1 = VotingClassifier(estimators=[
...         ('lr', clf1), ('rf', clf2), ('gnb', clf3)], voting='hard')
>>> eclf1 = eclf1.fit(X, y)
>>> print(eclf1.predict(X))
[1 1 1 2 2 2]
>>> eclf2 = VotingClassifier(estimators=[
...         ('lr', clf1), ('rf', clf2), ('gnb', clf3)],
...         voting='soft')
>>> eclf2 = eclf2.fit(X, y)
>>> print(eclf2.predict(X))
[1 1 1 2 2 2]
>>> eclf3 = VotingClassifier(estimators=[
...        ('lr', clf1), ('rf', clf2), ('gnb', clf3)],
...        voting='soft', weights=[2,1,1])
>>> eclf3 = eclf3.fit(X, y)
>>> print(eclf3.predict(X))
[1 1 1 2 2 2]

POSSIBLE NODE NAMES:
	VotingClassifierSklearnNode VotingClassifierSklearn
POSSIBLE INPUT TYPES:
	FeatureVector